-This is ../info/internals.info, produced by makeinfo version 4.0 from
+This is ../info/internals.info, produced by makeinfo version 4.6 from
internals/internals.texi.
INFO-DIR-SECTION XEmacs Editor
END-INFO-DIR-ENTRY
Copyright (C) 1992 - 1996 Ben Wing. Copyright (C) 1996, 1997 Sun
-Microsystems. Copyright (C) 1994 - 1998 Free Software Foundation.
-Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
+Microsystems. Copyright (C) 1994 - 1998, 2002, 2003 Free Software
+Foundation. Copyright (C) 1994, 1995 Board of Trustees, University of
+Illinois.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
Foundation instead of in the original English.
\1f
-File: internals.info, Node: The XEmacs Object System (Abstractly Speaking), Next: How Lisp Objects Are Represented in C, Prev: XEmacs From the Inside, Up: Top
-
-The XEmacs Object System (Abstractly Speaking)
-**********************************************
-
- At the heart of the Lisp interpreter is its management of objects.
-XEmacs Lisp contains many built-in objects, some of which are simple
-and others of which can be very complex; and some of which are very
-common, and others of which are rarely used or are only used
-internally. (Since the Lisp allocation system, with its automatic
-reclamation of unused storage, is so much more convenient than
-`malloc()' and `free()', the C code makes extensive use of it in its
-internal operations.)
-
- The basic Lisp objects are
-
-`integer'
- 28 or 31 bits of precision, or 60 or 63 bits on 64-bit machines;
- the reason for this is described below when the internal Lisp
- object representation is described.
-
-`float'
- Same precision as a double in C.
-
-`cons'
- A simple container for two Lisp objects, used to implement lists
- and most other data structures in Lisp.
-
-`char'
- An object representing a single character of text; chars behave
- like integers in many ways but are logically considered text
- rather than numbers and have a different read syntax. (the read
- syntax for a char contains the char itself or some textual
- encoding of it--for example, a Japanese Kanji character might be
- encoded as `^[$(B#&^[(B' using the ISO-2022 encoding
- standard--rather than the numerical representation of the char;
- this way, if the mapping between chars and integers changes, which
- is quite possible for Kanji characters and other extended
- characters, the same character will still be created. Note that
- some primitives confuse chars and integers. The worst culprit is
- `eq', which makes a special exception and considers a char to be
- `eq' to its integer equivalent, even though in no other case are
- objects of two different types `eq'. The reason for this
- monstrosity is compatibility with existing code; the separation of
- char from integer came fairly recently.)
-
-`symbol'
- An object that contains Lisp objects and is referred to by name;
- symbols are used to implement variables and named functions and to
- provide the equivalent of preprocessor constants in C.
-
-`vector'
- A one-dimensional array of Lisp objects providing constant-time
- access to any of the objects; access to an arbitrary object in a
- vector is faster than for lists, but the operations that can be
- done on a vector are more limited.
-
-`string'
- Self-explanatory; behaves much like a vector of chars but has a
- different read syntax and is stored and manipulated more compactly.
-
-`bit-vector'
- A vector of bits; similar to a string in spirit.
-
-`compiled-function'
- An object containing compiled Lisp code, known as "byte code".
-
-`subr'
- A Lisp primitive, i.e. a Lisp-callable function implemented in C.
-
- Note that there is no basic "function" type, as in more powerful
-versions of Lisp (where it's called a "closure"). XEmacs Lisp does not
-provide the closure semantics implemented by Common Lisp and Scheme.
-The guts of a function in XEmacs Lisp are represented in one of four
-ways: a symbol specifying another function (when one function is an
-alias for another), a list (whose first element must be the symbol
-`lambda') containing the function's source code, a compiled-function
-object, or a subr object. (In other words, given a symbol specifying
-the name of a function, calling `symbol-function' to retrieve the
-contents of the symbol's function cell will return one of these types
-of objects.)
-
- XEmacs Lisp also contains numerous specialized objects used to
-implement the editor:
+File: internals.info, Node: Introduction to Symbols, Next: Obarrays, Up: Symbols and Variables
-`buffer'
- Stores text like a string, but is optimized for insertion and
- deletion and has certain other properties that can be set.
+Introduction to Symbols
+=======================
-`frame'
- An object with various properties whose displayable representation
- is a "window" in window-system parlance.
-
-`window'
- A section of a frame that displays the contents of a buffer; often
- called a "pane" in window-system parlance.
-
-`window-configuration'
- An object that represents a saved configuration of windows in a
- frame.
-
-`device'
- An object representing a screen on which frames can be displayed;
- equivalent to a "display" in the X Window System and a "TTY" in
- character mode.
-
-`face'
- An object specifying the appearance of text or graphics; it has
- properties such as font, foreground color, and background color.
-
-`marker'
- An object that refers to a particular position in a buffer and
- moves around as text is inserted and deleted to stay in the same
- relative position to the text around it.
-
-`extent'
- Similar to a marker but covers a range of text in a buffer; can
- also specify properties of the text, such as a face in which the
- text is to be displayed, whether the text is invisible or
- unmodifiable, etc.
-
-`event'
- Generated by calling `next-event' and contains information
- describing a particular event happening in the system, such as the
- user pressing a key or a process terminating.
-
-`keymap'
- An object that maps from events (described using lists, vectors,
- and symbols rather than with an event object because the mapping
- is for classes of events, rather than individual events) to
- functions to execute or other events to recursively look up; the
- functions are described by name, using a symbol, or using lists to
- specify the function's code.
-
-`glyph'
- An object that describes the appearance of an image (e.g. pixmap)
- on the screen; glyphs can be attached to the beginning or end of
- extents and in some future version of XEmacs will be able to be
- inserted directly into a buffer.
-
-`process'
- An object that describes a connection to an externally-running
- process.
-
- There are some other, less-commonly-encountered general objects:
-
-`hash-table'
- An object that maps from an arbitrary Lisp object to another
- arbitrary Lisp object, using hashing for fast lookup.
-
-`obarray'
- A limited form of hash-table that maps from strings to symbols;
- obarrays are used to look up a symbol given its name and are not
- actually their own object type but are kludgily represented using
- vectors with hidden fields (this representation derives from GNU
- Emacs).
-
-`specifier'
- A complex object used to specify the value of a display property; a
- default value is given and different values can be specified for
- particular frames, buffers, windows, devices, or classes of device.
-
-`char-table'
- An object that maps from chars or classes of chars to arbitrary
- Lisp objects; internally char tables use a complex nested-vector
- representation that is optimized to the way characters are
- represented as integers.
-
-`range-table'
- An object that maps from ranges of integers to arbitrary Lisp
- objects.
-
- And some strange special-purpose objects:
-
-`charset'
-`coding-system'
- Objects used when MULE, or multi-lingual/Asian-language, support is
- enabled.
-
-`color-instance'
-`font-instance'
-`image-instance'
- An object that encapsulates a window-system resource; instances are
- mostly used internally but are exposed on the Lisp level for
- cleanness of the specifier model and because it's occasionally
- useful for Lisp program to create or query the properties of
- instances.
-
-`subwindow'
- An object that encapsulate a "subwindow" resource, i.e. a
- window-system child window that is drawn into by an external
- process; this object should be integrated into the glyph system
- but isn't yet, and may change form when this is done.
-
-`tooltalk-message'
-`tooltalk-pattern'
- Objects that represent resources used in the ToolTalk interprocess
- communication protocol.
-
-`toolbar-button'
- An object used in conjunction with the toolbar.
-
- And objects that are only used internally:
-
-`opaque'
- A generic object for encapsulating arbitrary memory; this allows
- you the generality of `malloc()' and the convenience of the Lisp
- object system.
-
-`lstream'
- A buffering I/O stream, used to provide a unified interface to
- anything that can accept output or provide input, such as a file
- descriptor, a stdio stream, a chunk of memory, a Lisp buffer, a
- Lisp string, etc.; it's a Lisp object to make its memory
- management more convenient.
-
-`char-table-entry'
- Subsidiary objects in the internal char-table representation.
-
-`extent-auxiliary'
-`menubar-data'
-`toolbar-data'
- Various special-purpose objects that are basically just used to
- encapsulate memory for particular subsystems, similar to the more
- general "opaque" object.
-
-`symbol-value-forward'
-`symbol-value-buffer-local'
-`symbol-value-varalias'
-`symbol-value-lisp-magic'
- Special internal-only objects that are placed in the value cell of
- a symbol to indicate that there is something special with this
- variable - e.g. it has no value, it mirrors another variable, or
- it mirrors some C variable; there is really only one kind of
- object, called a "symbol-value-magic", but it is sort-of halfway
- kludged into semi-different object types.
+A symbol is basically just an object with four fields: a name (a
+string), a value (some Lisp object), a function (some Lisp object), and
+a property list (usually a list of alternating keyword/value pairs).
+What makes symbols special is that there is usually only one symbol with
+a given name, and the symbol is referred to by name. This makes a
+symbol a convenient way of calling up data by name, i.e. of implementing
+variables. (The variable's value is stored in the "value slot".)
+Similarly, functions are referenced by name, and the definition of the
+function is stored in a symbol's "function slot". This means that
+there can be a distinct function and variable with the same name. The
+property list is used as a more general mechanism of associating
+additional values with particular names, and once again the namespace is
+independent of the function and variable namespaces.
- Some types of objects are "permanent", meaning that once created,
-they do not disappear until explicitly destroyed, using a function such
-as `delete-buffer', `delete-window', `delete-frame', etc. Others will
-disappear once they are not longer used, through the garbage collection
-mechanism. Buffers, frames, windows, devices, and processes are among
-the objects that are permanent. Note that some objects can go both
-ways: Faces can be created either way; extents are normally permanent,
-but detached extents (extents not referring to any text, as happens to
-some extents when the text they are referring to is deleted) are
-temporary. Note that some permanent objects, such as faces and coding
-systems, cannot be deleted. Note also that windows are unique in that
-they can be _undeleted_ after having previously been deleted. (This
-happens as a result of restoring a window configuration.)
-
- Note that many types of objects have a "read syntax", i.e. a way of
-specifying an object of that type in Lisp code. When you load a Lisp
-file, or type in code to be evaluated, what really happens is that the
-function `read' is called, which reads some text and creates an object
-based on the syntax of that text; then `eval' is called, which possibly
-does something special; then this loop repeats until there's no more
-text to read. (`eval' only actually does something special with
-symbols, which causes the symbol's value to be returned, similar to
-referencing a variable; and with conses [i.e. lists], which cause a
-function invocation. All other values are returned unchanged.)
+\1f
+File: internals.info, Node: Obarrays, Next: Symbol Values, Prev: Introduction to Symbols, Up: Symbols and Variables
+
+Obarrays
+========
+
+The identity of symbols with their names is accomplished through a
+structure called an obarray, which is just a poorly-implemented hash
+table mapping each string to the symbol whose name is that string. (I say
+"poorly implemented" because an obarray appears in Lisp as a vector
+with some hidden fields rather than as its own opaque type. This is an
+Emacs Lisp artifact that should be fixed.)
+
+ Obarrays are implemented as a vector of some fixed size (which should
+be a prime for best results), where each "bucket" of the vector
+contains one or more symbols, threaded through a hidden `next' field in
+the symbol. Looking up a symbol in an obarray and adding a symbol to
+an obarray are both accomplished through standard hash-table techniques.
+
+ The standard Lisp function for working with symbols and obarrays is
+`intern'. This looks up a symbol in an obarray given its name; if it's
+not found, a new symbol is automatically created with the specified
+name, added to the obarray, and returned. This is what happens when the
+Lisp reader encounters a symbol (or more precisely, encounters the name
+of a symbol) in some text that it is reading. There is a standard
+obarray called `obarray' that is used for this purpose, although the
+Lisp programmer is free to create his own obarrays and `intern' symbols
+in them.
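The bucket-and-chain layout and the behavior of `intern' described above can be sketched in C. This is a minimal toy model, not the XEmacs implementation: the names `toy_intern', `toy_intern_soft', `struct symbol', and `OBARRAY_SIZE', the hash function, and the bucket count are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OBARRAY_SIZE 31   /* a prime, as the text recommends (value is arbitrary) */

/* Toy symbol: only the name and the hidden chain link are modeled. */
struct symbol {
    char *name;
    struct symbol *next;  /* next symbol in the same bucket */
};

/* Toy obarray: a fixed-size vector of bucket heads. */
struct obarray {
    struct symbol *buckets[OBARRAY_SIZE];
};

/* Hypothetical string hash; any reasonable hash would do. */
unsigned toy_hash_name(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char) *s++;
    return h % OBARRAY_SIZE;
}

/* Modeled on `intern-soft': look up only, NULL when the name is absent. */
struct symbol *toy_intern_soft(struct obarray *ob, const char *name)
{
    struct symbol *sym;
    assert(name != NULL);
    for (sym = ob->buckets[toy_hash_name(name)]; sym; sym = sym->next)
        if (strcmp(sym->name, name) == 0)
            return sym;
    return NULL;
}

/* Modeled on `intern': look up, creating and adding the symbol on a miss. */
struct symbol *toy_intern(struct obarray *ob, const char *name)
{
    struct symbol *sym = toy_intern_soft(ob, name);
    unsigned h;
    if (sym)
        return sym;
    sym = malloc(sizeof *sym);
    if (!sym)
        abort();
    sym->name = malloc(strlen(name) + 1);
    if (!sym->name)
        abort();
    strcpy(sym->name, name);
    h = toy_hash_name(name);
    sym->next = ob->buckets[h];   /* thread onto the front of the chain */
    ob->buckets[h] = sym;
    return sym;
}
```

Calling `toy_intern' twice with the same name on the same toy obarray returns the same pointer, which is the "one symbol per name per obarray" property; two separate toy obarrays would each hold their own symbol for that name.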
+
+ Note that, once a symbol is in an obarray, it stays there until
+something is done about it, and the standard obarray `obarray' always
+stays around, so once you use any particular variable name, a
+corresponding symbol will stay around in `obarray' until you exit
+XEmacs.
+
+ Note that `obarray' itself is a variable, and as such there is a
+symbol in `obarray' whose name is `"obarray"' and which contains
+`obarray' as its value.
+
+ Note also that this call to `intern' occurs only when in the Lisp
+reader, not when the code is executed (at which point the symbol is
+already around, stored as such in the definition of the function).
+
+ You can create your own obarray using `make-vector' (this is
+horrible but is an Emacs Lisp artifact) and intern symbols into that obarray.
+Doing that will result in two or more symbols with the same name.
+However, at most one of these symbols is in the standard `obarray': You
+cannot have two symbols of the same name in any particular obarray.
+Note that you cannot add a symbol to an obarray in any fashion other
+than using `intern': i.e. you can't take an existing symbol and put it
+in an existing obarray. Nor can you change the name of an existing
+symbol. (Since obarrays are vectors, you can violate the consistency of
+things by storing directly into the vector, but let's ignore that
+possibility.)
+
+ Usually symbols are created by `intern', but if you really want, you
+can explicitly create a symbol using `make-symbol', giving it some
+name. The resulting symbol is not in any obarray (i.e. it is
+"uninterned"), and you can't add it to any obarray. Therefore its
+primary purpose is as a symbol to use in macros to avoid namespace
+pollution. It can also be used as a carrier of information, but cons
+cells could probably be used just as well.
+
+ You can also use `intern-soft' to look up a symbol but not create a
+new one, and `unintern' to remove a symbol from an obarray. This
+returns the removed symbol. (Remember: You can't put the symbol back
+into any obarray.) Finally, `mapatoms' maps over all of the symbols in
+an obarray.
- The read syntax
+\1f
+File: internals.info, Node: Symbol Values, Prev: Obarrays, Up: Symbols and Variables
+
+Symbol Values
+=============
+
+The value field of a symbol normally contains a Lisp object. However,
+a symbol can be "unbound", meaning that it logically has no value.
+This is internally indicated by storing a special Lisp object, called
+"the unbound marker" and stored in the global variable `Qunbound'. The
+unbound marker is of a special Lisp object type called
+"symbol-value-magic". It is impossible for the Lisp programmer to
+directly create or access any object of this type.
+
+ *You must not let any "symbol-value-magic" object escape to the Lisp
+level.* Printing any of these objects will cause the message `INTERNAL
+EMACS BUG' to appear as part of the print representation. (You may see
+this normally when you call `debug_print()' from the debugger on a Lisp
+object.) If you let one of these objects escape to the Lisp level, you
+will violate a number of assumptions contained in the C code and cause
+the unbound marker to malfunction.
+
+ When a symbol is created, its value field (and function field) are
+set to `Qunbound'. The Lisp programmer can restore these conditions
+later using `makunbound' or `fmakunbound', and can query to see whether
+the value or function fields are "bound" (i.e. have a value other than
+`Qunbound') using `boundp' and `fboundp'. The fields are set to a
+normal Lisp object using `set' (or `setq') and `fset'.
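The sentinel idea behind the unbound marker can be sketched in C. This is a toy model under stated assumptions: `toy_object', `TOY_QUNBOUND', and the `toy_*' functions are hypothetical names, and a unique address stands in for the real `Qunbound' object, whose actual representation is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a Lisp object; the real type is more elaborate. */
typedef const void *toy_object;

/* A unique address serves as the toy unbound marker. */
static const char toy_unbound_cell = 0;
#define TOY_QUNBOUND ((toy_object) &toy_unbound_cell)

struct toy_symbol {
    const char *name;
    toy_object value;      /* value slot */
    toy_object function;   /* function slot */
};

/* A new symbol starts with both slots holding the unbound marker. */
void toy_symbol_init(struct toy_symbol *s, const char *name)
{
    s->name = name;
    s->value = TOY_QUNBOUND;
    s->function = TOY_QUNBOUND;
}

/* Modeled on `boundp' and `fboundp': compare against the marker. */
int toy_boundp(const struct toy_symbol *s)  { return s->value != TOY_QUNBOUND; }
int toy_fboundp(const struct toy_symbol *s) { return s->function != TOY_QUNBOUND; }

/* Modeled on `set' and `fset': callers must never pass the marker itself. */
void toy_set(struct toy_symbol *s, toy_object v)
{
    assert(v != TOY_QUNBOUND);
    s->value = v;
}

void toy_fset(struct toy_symbol *s, toy_object f)
{
    assert(f != TOY_QUNBOUND);
    s->function = f;
}

/* Modeled on `makunbound' and `fmakunbound': restore the marker. */
void toy_makunbound(struct toy_symbol *s)  { s->value = TOY_QUNBOUND; }
void toy_fmakunbound(struct toy_symbol *s) { s->function = TOY_QUNBOUND; }
```

The assertions in `toy_set' and `toy_fset' mirror the rule that the marker is never handed around as an ordinary value; only the dedicated "make unbound" operations store it.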
+
+ Other symbol-value-magic objects are used as special markers to
+indicate variables that have non-normal properties. This includes any
+variables that are tied into C variables (setting the variable magically
+sets some global variable in the C code, and likewise for retrieving the
+variable's value), variables that magically tie into slots in the
+current buffer, variables that are buffer-local, etc. The
+symbol-value-magic object is stored in the value cell in place of a
+normal object, and the code to retrieve a symbol's value (i.e.
+`symbol-value') knows how to do special things with them. This means
+that you should not just fetch the value cell directly if you want a
+symbol's value.
+
+ The exact workings of this are rather complex and involved and are
+well-documented in comments in `buffer.c', `symbols.c', and `lisp.h'.
- 17297
+\1f
+File: internals.info, Node: Buffers and Textual Representation, Next: MULE Character Sets and Encodings, Prev: Symbols and Variables, Up: Top
- converts to an integer whose value is 17297.
+Buffers and Textual Representation
+**********************************
- 1.983e-4
+* Menu:
- converts to a float whose value is 1.983e-4, or .0001983.
+* Introduction to Buffers:: A buffer holds a block of text such as a file.
+* The Text in a Buffer:: Representation of the text in a buffer.
+* Buffer Lists:: Keeping track of all buffers.
+* Markers and Extents:: Tagging locations within a buffer.
+* Bufbytes and Emchars:: Representation of individual characters.
+* The Buffer Object:: The Lisp object corresponding to a buffer.
- ?b
+\1f
+File: internals.info, Node: Introduction to Buffers, Next: The Text in a Buffer, Up: Buffers and Textual Representation
- converts to a char that represents the lowercase letter b.
+Introduction to Buffers
+=======================
- ?^[$(B#&^[(B
+A buffer is logically just a Lisp object that holds some text. In
+this, it is like a string, but a buffer is optimized for frequent
+insertion and deletion, while a string is not. Furthermore:
+
+ 1. Buffers are "permanent" objects, i.e. once you create them, they
+ remain around, and need to be explicitly deleted before they go
+ away.
+
+ 2. Each buffer has a unique name, which is a string. Buffers are
+ normally referred to by name. In this respect, they are like
+ symbols.
+
+ 3. Buffers have a default insertion position, called "point".
+ Inserting text (unless you explicitly give a position) goes at
+ point, and moves point forward past the text. This is what is
+ going on when you type text into Emacs.
+
+ 4. Buffers have lots of extra properties associated with them.
+
+ 5. Buffers can be "displayed". What this means is that there exist a
+ number of "windows", which are objects that correspond to some
+ visible section of your display, and each window has an associated
+ buffer, and the current contents of the buffer are shown in that
+ section of the display. The redisplay mechanism (which takes care
+ of doing this) knows how to look at the text of a buffer and come
+ up with some reasonable way of displaying this. Many of the
+ properties of a buffer control how the buffer's text is displayed.
+
+ 6. One buffer is distinguished and called the "current buffer". It is
+ stored in the variable `current_buffer'. Buffer operations operate
+ on this buffer by default. When you are typing text into a
+ buffer, the buffer you are typing into is always `current_buffer'.
+ Switching to a different window changes the current buffer. Note
+ that Lisp code can temporarily change the current buffer using
+ `set-buffer' (often enclosed in a `save-excursion' so that the
+ former current buffer gets restored when the code is finished).
+ However, calling `set-buffer' will NOT cause a permanent change in
+ the current buffer. The reason for this is that the top-level
+ event loop sets `current_buffer' to the buffer of the selected
+ window, each time it finishes executing a user command.
+
+ Make sure you understand the distinction between "current buffer"
+and "buffer of the selected window", and the distinction between
+"point" of the current buffer and "window-point" of the selected
+window. (This latter distinction is explained in detail in the section
+on windows.)
- (where `^[' actually is an `ESC' character) converts to a particular
-Kanji character when using an ISO2022-based coding system for input.
-(To decode this goo: `ESC' begins an escape sequence; `ESC $ (' is a
-class of escape sequences meaning "switch to a 94x94 character set";
-`ESC $ ( B' means "switch to Japanese Kanji"; `#' and `&' collectively
-index into a 94-by-94 array of characters [subtract 33 from the ASCII
-value of each character to get the corresponding index]; `ESC (' is a
-class of escape sequences meaning "switch to a 94 character set"; `ESC
-(B' means "switch to US ASCII". It is a coincidence that the letter
-`B' is used to denote both Japanese Kanji and US ASCII. If the first
-`B' were replaced with an `A', you'd be requesting a Chinese Hanzi
-character from the GB2312 character set.)
+\1f
+File: internals.info, Node: The Text in a Buffer, Next: Buffer Lists, Prev: Introduction to Buffers, Up: Buffers and Textual Representation
+
+The Text in a Buffer
+====================
+
+The text in a buffer consists of a sequence of zero or more characters.
+A "character" is an integer that logically represents a letter,
+number, space, or other unit of text. Most of the characters that you
+will typically encounter belong to the ASCII set of characters, but
+there are also characters for various sorts of accented letters,
+special symbols, Chinese and Japanese characters (e.g. Kanji, Katakana,
+etc.), Cyrillic and Greek letters, etc. The actual number of possible
+characters is quite large.
+
+ For now, we can view a character as some non-negative integer that
+has some shape that defines how it typically appears (e.g. as an
+uppercase A). (The exact way in which a character appears depends on the
+font used to display the character.) The internal type of characters in
+the C code is an `Emchar'; this is just an `int', but using a symbolic
+type makes the code clearer.
+
+ Between every character in a buffer is a "buffer position" or
+"character position". We can speak of the character before or after a
+particular buffer position, and when you insert a character at a
+particular position, all characters after that position end up at new
+positions. When we speak of the character "at" a position, we really
+mean the character after the position. (This schizophrenia between a
+buffer position being "between" two characters and "on" a character is
+rampant in Emacs.)
+
+ Buffer positions are numbered starting at 1. This means that
+position 1 is before the first character, and position 0 is not valid.
+If there are N characters in a buffer, then buffer position N+1 is
+after the last one, and position N+2 is not valid.
+
+ The internal makeup of the Emchar integer varies depending on whether
+we have compiled with MULE support. If not, the Emchar integer is an
+8-bit integer with possible values from 0 - 255. 0 - 127 are the
+standard ASCII characters, while 128 - 255 are the characters from the
+ISO-8859-1 character set. If we have compiled with MULE support, an
+Emchar is a 19-bit integer, with the various bits having meanings
+according to a complex scheme that will be detailed later. The
+characters numbered 0 - 255 still have the same meanings as for the
+non-MULE case, though.
+
+ Internally, the text in a buffer is represented in a fairly simple
+fashion: as a contiguous array of bytes, with a "gap" of some size in
+the middle. Although the gap is of some substantial size in bytes,
+there is no text contained within it: From the perspective of the text
+in the buffer, it does not exist. The gap logically sits at some buffer
+position, between two characters (or possibly at the beginning or end of
+the buffer). Insertion of text in a buffer at a particular position is
+always accomplished by first moving the gap to that position (i.e.
+through some block moving of text), then writing the text into the
+beginning of the gap, thereby shrinking the gap. If the gap shrinks
+down to nothing, a new gap is created. (What actually happens is that a
+new gap is "created" at the end of the buffer's text, which requires
+nothing more than changing a couple of indices; then the gap is "moved"
+to the position where the insertion needs to take place by moving up in
+memory all the text after that position.) Similarly, deletion occurs
+by moving the gap to the place where the text is to be deleted, and
+then simply expanding the gap to include the deleted text.
+("Expanding" and "shrinking" the gap as just described means just that
+the internal indices that keep track of where the gap is located are
+changed.)
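The gap mechanism described above can be sketched as a small C gap buffer. This is an illustrative sketch, not the code in `insdel.c': `struct gap_buffer' and the `gb_*' functions are hypothetical, offsets are 0-based for brevity (buffer positions in the text are 1-based), the slack of 64 bytes is arbitrary, and the gap is enlarged in place rather than created at the end of the text and then moved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct gap_buffer {
    unsigned char *mem;  /* text before the gap, the gap, text after the gap */
    size_t gap_start;    /* 0-based offset of the first gap byte */
    size_t gap_size;     /* bytes currently in the gap */
    size_t text_len;     /* bytes of text, excluding the gap */
};

/* Move the gap so that it starts at text offset POS. */
void gb_move_gap(struct gap_buffer *b, size_t pos)
{
    assert(pos <= b->text_len);
    if (pos < b->gap_start)        /* slide text [pos, gap_start) up past the gap */
        memmove(b->mem + pos + b->gap_size, b->mem + pos, b->gap_start - pos);
    else if (pos > b->gap_start)   /* slide text just after the gap down into it */
        memmove(b->mem + b->gap_start,
                b->mem + b->gap_start + b->gap_size, pos - b->gap_start);
    b->gap_start = pos;
}

/* Make sure the gap can hold at least NEED bytes, growing the block if not. */
void gb_ensure_gap(struct gap_buffer *b, size_t need)
{
    size_t extra;
    if (b->gap_size >= need)
        return;
    extra = need + 64;             /* arbitrary slack */
    b->mem = realloc(b->mem, b->text_len + b->gap_size + extra);
    if (!b->mem)
        abort();
    /* Slide the text after the gap up by EXTRA bytes to widen the gap. */
    memmove(b->mem + b->gap_start + b->gap_size + extra,
            b->mem + b->gap_start + b->gap_size, b->text_len - b->gap_start);
    b->gap_size += extra;
}

/* Insert N bytes at POS: move the gap there, then write into its start. */
void gb_insert(struct gap_buffer *b, size_t pos, const char *s, size_t n)
{
    gb_ensure_gap(b, n);
    gb_move_gap(b, pos);
    memcpy(b->mem + b->gap_start, s, n);
    b->gap_start += n;             /* the gap shrinks from the front */
    b->gap_size -= n;
    b->text_len += n;
}

/* Delete N bytes at POS: move the gap there, then let it swallow them. */
void gb_delete(struct gap_buffer *b, size_t pos, size_t n)
{
    assert(pos + n <= b->text_len);
    gb_move_gap(b, pos);
    b->gap_size += n;              /* only the bookkeeping changes */
    b->text_len -= n;
}

/* Copy the text without the gap into OUT (needs text_len + 1 bytes). */
void gb_copy_text(const struct gap_buffer *b, char *out)
{
    if (b->mem == NULL) {
        out[0] = '\0';
        return;
    }
    memcpy(out, b->mem, b->gap_start);
    memcpy(out + b->gap_start, b->mem + b->gap_start + b->gap_size,
           b->text_len - b->gap_start);
    out[b->text_len] = '\0';
}
```

Note that `gb_delete' moves no text beyond what is needed to bring the gap to the deletion point, and that the allocated block never shrinks, which matches the memory behavior the text goes on to describe.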
+
+ Note that the total amount of memory allocated for a buffer's text
+never decreases while the buffer is live. Therefore, if you load up a
+20-megabyte file and then delete all but one character, there will be a
+20-megabyte gap, which won't get any smaller (except by inserting
+characters back again). Once the buffer is killed, the memory allocated
+for the buffer text will be freed, but it will still be sitting on the
+heap, taking up virtual memory, and will not be released back to the
+operating system. (However, if you have compiled XEmacs with rel-alloc,
+the situation is different. In this case, the space _will_ be released
+back to the operating system. However, this tends to result in a
+noticeable speed penalty.)
+
+ Astute readers may notice that the text in a buffer is represented as
+an array of _bytes_, while (at least in the MULE case) an Emchar is a
+19-bit integer, which clearly cannot fit in a byte. This means (of
+course) that the text in a buffer uses a different representation from
+an Emchar: specifically, the 19-bit Emchar becomes a series of one to
+four bytes. The conversion between these two representations is complex
+and will be described later.
+
+ In the non-MULE case, everything is very simple: An Emchar is an
+8-bit value, which fits neatly into one byte.
+
+ If we are given a buffer position and want to retrieve the character
+at that position, we need to follow these steps:
+
+ 1. Pretend there's no gap, and convert the buffer position into a
+ "byte index" that indexes to the appropriate byte in the buffer's
+ stream of textual bytes. By convention, byte indices begin at 1,
+ just like buffer positions. In the non-MULE case, byte indices
+ and buffer positions are identical, since one character equals one
+ byte.
+
+ 2. Convert the byte index into a "memory index", which takes the gap
+ into account. The memory index is a direct index into the block of
+ memory that stores the text of a buffer. This basically just
+ involves checking to see if the byte index is past the gap, and if
+ so, adding the size of the gap to it. By convention, memory
+ indices begin at 1, just like buffer positions and byte indices,
+ and when referring to the position that is "at" the gap, we always
+ use the memory position at the _beginning_, not at the end, of the
+ gap.
+
+ 3. Fetch the appropriate bytes at the determined memory position.
+
+ 4. Convert these bytes into an Emchar.
+
+ In the non-MULE case, (3) and (4) boil down to a simple one-byte
+memory access.
+
+ Note that we have defined three types of positions in a buffer:
+
+ 1. "buffer positions" or "character positions", typedef `Bufpos'
+
+ 2. "byte indices", typedef `Bytind'
+
+ 3. "memory indices", typedef `Memind'
+
+ All three typedefs are just `int's, but defining them this way makes
+things a lot clearer.
+
+ Most code works with buffer positions. In particular, all Lisp code
+that refers to text in a buffer uses buffer positions. Lisp code does
+not know that byte indices or memory indices exist.
+
+ Finally, we have a typedef for the bytes in a buffer. This is a
+`Bufbyte', which is an unsigned char. Referring to them as Bufbytes
+underscores the fact that we are working with a string of bytes in the
+internal Emacs buffer representation rather than in one of a number of
+possible alternative representations (e.g. EUC-encoded text, etc.).
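The four retrieval steps listed earlier can be sketched in C for the non-MULE case, using the typedef names from the text. This is a sketch under assumptions: `struct toy_text' and the `toy_*' functions are hypothetical, and the choice of `>' for converting a position versus `>=' for fetching the character after it is one reading of the "at the gap means the beginning of the gap" convention, not a quotation of the real macros.

```c
#include <assert.h>

typedef int Bufpos;            /* buffer (character) position, 1-based */
typedef int Bytind;            /* byte index, ignoring the gap, 1-based */
typedef int Memind;            /* memory index, counting the gap, 1-based */
typedef int Emchar;            /* a character */
typedef unsigned char Bufbyte; /* one byte of internal buffer text */

/* Hypothetical text block: bytes before the gap, the gap, bytes after it. */
struct toy_text {
    Bufbyte *beg;   /* start of the memory block */
    Bytind gpt;     /* byte index at which the gap sits */
    int gap_size;   /* size of the gap in bytes */
    int nbytes;     /* bytes of text, excluding the gap */
};

/* Step 1: without MULE, one character is one byte. */
Bytind toy_bufpos_to_bytind(Bufpos pos)
{
    return (Bytind) pos;
}

/* Step 2: only indices strictly past the gap are shifted, so the
   position "at" the gap maps to the beginning of the gap. */
Memind toy_bytind_to_memind(const struct toy_text *t, Bytind bi)
{
    assert(bi >= 1 && bi <= t->nbytes + 1);
    return bi > t->gpt ? bi + t->gap_size : bi;
}

/* Steps 3 and 4: fetch the character after POS.  The byte after a
   position at the gap lies beyond the gap, hence `>=' here. */
Emchar toy_char_at(const struct toy_text *t, Bufpos pos)
{
    Bytind bi = toy_bufpos_to_bytind(pos);
    Memind mi;
    assert(bi >= 1 && bi <= t->nbytes);
    mi = bi >= t->gpt ? bi + t->gap_size : bi;
    return (Emchar) t->beg[mi - 1];   /* memory indices are 1-based */
}
```

With the text "abcd" stored as `ab', a four-byte gap, then `cd', the gap sits at byte index 3; fetching position 3 skips the gap and yields `c', while converting position 3 to a memory index yields 3, the beginning of the gap.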
- "foobar"
+\1f
+File: internals.info, Node: Buffer Lists, Next: Markers and Extents, Prev: The Text in a Buffer, Up: Buffers and Textual Representation
+
+Buffer Lists
+============
+
+Recall from earlier that buffers are "permanent" objects, i.e. that they
+remain around until explicitly deleted. This entails that there is a
+list of all the buffers in existence. This list is actually an
+assoc-list (mapping from the buffer's name to the buffer) and is stored
+in the global variable `Vbuffer_alist'.
+
+ The order of the buffers in the list is important: the buffers are
+ordered approximately from most-recently-used to least-recently-used.
+Switching to a buffer using `switch-to-buffer', `pop-to-buffer', etc.
+and switching windows using `other-window', etc. usually brings the
+new current buffer to the front of the list. `switch-to-buffer',
+`other-buffer', etc. look at the beginning of the list to find an
+alternative buffer to suggest. You can also explicitly move a buffer
+to the end of the list using `bury-buffer'.
+
+ In addition to the global ordering in `Vbuffer_alist', each frame
+has its own ordering of the list. These lists always contain the same
+elements as in `Vbuffer_alist' although possibly in a different order.
+`buffer-list' normally returns the list for the selected frame. This
+allows you to work in separate frames without things interfering with
+each other.
+
+ The standard way to look up a buffer given a name is `get-buffer',
+and the standard way to create a new buffer is `get-buffer-create',
+which looks up a buffer with a given name, creating a new one if
+necessary. These operations correspond exactly with the symbol
+operations `intern-soft' and `intern', respectively. You can also
+force a new buffer to be created using `generate-new-buffer', which
+takes a name and (if necessary) makes a unique name from this by
+appending a number, and then creates the buffer. This is basically
+like the symbol operation `gensym'.
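The parallel between buffer lookup and symbol interning can be sketched in C. This is a toy model only: the real list is a Lisp assoc-list in `Vbuffer_alist', whereas this sketch uses a plain linked list, and `struct toy_buffer', `toy_buffer_list', and the `toy_*' functions are hypothetical names. Placing a newly created buffer at the front of the list is an arbitrary choice made here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_buffer {
    char *name;
    struct toy_buffer *next;   /* next entry in the buffer list */
};

/* Head of the list; entries nearer the front are more recently used. */
struct toy_buffer *toy_buffer_list = NULL;

/* Modeled on `get-buffer': look up by name, NULL if no such buffer. */
struct toy_buffer *toy_get_buffer(const char *name)
{
    struct toy_buffer *b;
    assert(name != NULL);
    for (b = toy_buffer_list; b; b = b->next)
        if (strcmp(b->name, name) == 0)
            return b;
    return NULL;
}

/* Modeled on `get-buffer-create': look up, creating the buffer on a miss. */
struct toy_buffer *toy_get_buffer_create(const char *name)
{
    struct toy_buffer *b = toy_get_buffer(name);
    if (b)
        return b;
    b = malloc(sizeof *b);
    if (!b)
        abort();
    b->name = malloc(strlen(name) + 1);
    if (!b->name)
        abort();
    strcpy(b->name, name);
    b->next = toy_buffer_list;   /* front placement: a choice of this sketch */
    toy_buffer_list = b;
    return b;
}

/* Modeled on `bury-buffer': unlink B and relink it at the end of the list. */
void toy_bury_buffer(struct toy_buffer *b)
{
    struct toy_buffer **link = &toy_buffer_list;
    while (*link && *link != b)
        link = &(*link)->next;
    if (!*link)
        return;                  /* B is not in the list */
    *link = b->next;             /* unlink B */
    while (*link)
        link = &(*link)->next;   /* walk to the end */
    b->next = NULL;
    *link = b;
}
```

As with symbols in an obarray, looking up an existing name always returns the same object, and the buffer stays in the list until something explicitly removes it.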
+
+\1f
+File: internals.info, Node: Markers and Extents, Next: Bufbytes and Emchars, Prev: Buffer Lists, Up: Buffers and Textual Representation
+
+Markers and Extents
+===================
+
+Some of the objects associated with a buffer are logically attached to
+particular buffer positions. Such an object can be used to keep track
+of a buffer position when text is inserted and deleted, so that it
+remains at the same spot relative to the text around it; to assign
+properties to particular sections of text; etc. There are two kinds
+of objects that are useful in this regard: "markers" and "extents".
+
+ A "marker" is simply a flag placed at a particular buffer position,
+which is moved around as text is inserted and deleted. Markers are
+used for all sorts of purposes, such as the `mark' that is the other
+end of textual regions to be cut, copied, etc.
+
+ An "extent" is similar to two markers plus some associated
+properties, and is used to keep track of regions in a buffer as text is
+inserted and deleted, and to add properties (e.g. fonts) to particular
+regions of text. The external interface of extents is explained
+elsewhere.
+
+ The important thing here is that markers and extents simply contain
+buffer positions in them as integers, and every time text is inserted or
+deleted, these positions must be updated. In order to minimize the
+amount of shuffling that needs to be done, the positions in markers and
+extents (there's one per marker, two per extent) are stored in Meminds.
+This means that they only need to be moved when the text is physically
+moved in memory; since the gap structure tries to minimize this, it also
+minimizes the number of marker and extent indices that need to be
+adjusted. Look in `insdel.c' for the details of how this works.
+
+ One other important distinction is that markers are "temporary"
+while extents are "permanent". This means that markers disappear as
+soon as there are no more pointers to them, and correspondingly, there
+is no way to determine what markers are in a buffer if you are just
+given the buffer. Extents remain in a buffer until they are detached
+(which could happen as a result of text being deleted) or the buffer is
+deleted, and primitives do exist to enumerate the extents in a buffer.
- converts to a string.
+\1f
+File: internals.info, Node: Bufbytes and Emchars, Next: The Buffer Object, Prev: Markers and Extents, Up: Buffers and Textual Representation
- foobar
+Bufbytes and Emchars
+====================
- converts to a symbol whose name is `"foobar"'. This is done by
-looking up the string equivalent in the global variable `obarray',
-whose contents should be an obarray. If no symbol is found, a new
-symbol with the name `"foobar"' is automatically created and added to
-`obarray'; this process is called "interning" the symbol.
+Not yet documented.
- (foo . bar)
+\1f
+File: internals.info, Node: The Buffer Object, Prev: Bufbytes and Emchars, Up: Buffers and Textual Representation
+
+The Buffer Object
+=================
+
+Buffers contain fields not directly accessible by the Lisp programmer.
+We describe them here, naming them by the names used in the C code.
+Many are accessible indirectly in Lisp programs via Lisp primitives.
+
+`name'
+ The buffer name is a string that names the buffer. It is
+ guaranteed to be unique. *Note Buffer Names: (lispref)Buffer
+ Names.
+
+`save_modified'
+ This field contains the time when the buffer was last saved, as an
+ integer. *Note Buffer Modification: (lispref)Buffer Modification.
+
+`modtime'
+ This field contains the modification time of the visited file. It
+ is set when the file is written or read. Every time the buffer is
+ written to the file, this field is compared to the modification
+ time of the file. *Note Buffer Modification: (lispref)Buffer
+ Modification.
+
+`auto_save_modified'
+ This field contains the time when the buffer was last auto-saved.
+
+`last_window_start'
+ This field contains the `window-start' position in the buffer as of
+ the last time the buffer was displayed in a window.
+
+`undo_list'
+ This field points to the buffer's undo list. *Note Undo:
+ (lispref)Undo.
+
+`syntax_table_v'
+ This field contains the syntax table for the buffer. *Note Syntax
+ Tables: (lispref)Syntax Tables.
+
+`downcase_table'
+ This field contains the conversion table for converting text to
+ lower case. *Note Case Tables: (lispref)Case Tables.
+
+`upcase_table'
+ This field contains the conversion table for converting text to
+ upper case. *Note Case Tables: (lispref)Case Tables.
+
+`case_canon_table'
+ This field contains the conversion table for canonicalizing text
+ for case-folding search. *Note Case Tables: (lispref)Case Tables.
+
+`case_eqv_table'
+ This field contains the equivalence table for case-folding search.
+ *Note Case Tables: (lispref)Case Tables.
+
+`display_table'
+ This field contains the buffer's display table, or `nil' if it
+ doesn't have one. *Note Display Tables: (lispref)Display Tables.
+
+`markers'
+ This field contains the chain of all markers that currently point
+ into the buffer. Deletion of text in the buffer, and motion of
+ the buffer's gap, must check each of these markers and perhaps
+ update it. *Note Markers: (lispref)Markers.
+
+`backed_up'
+ This field is a flag that tells whether a backup file has been
+ made for the visited file of this buffer.
+
+`mark'
+ This field contains the mark for the buffer. The mark is a marker,
+ hence it is also included on the list `markers'. *Note The Mark:
+ (lispref)The Mark.
+
+`mark_active'
+ This field is non-`nil' if the buffer's mark is active.
+
+`local_var_alist'
+ This field contains the association list describing the variables
+ local in this buffer, and their values, with the exception of
+ local variables that have special slots in the buffer object.
+ (Those slots are omitted from this table.) *Note Buffer-Local
+ Variables: (lispref)Buffer-Local Variables.
+
+`modeline_format'
+ This field contains a Lisp object which controls how to display
+ the mode line for this buffer. *Note Modeline Format:
+ (lispref)Modeline Format.
+
+`base_buffer'
+ This field holds the buffer's base buffer (if it is an indirect
+ buffer), or `nil'.
- converts to a cons cell containing the symbols `foo' and `bar'.
+\1f
+File: internals.info, Node: MULE Character Sets and Encodings, Next: The Lisp Reader and Compiler, Prev: Buffers and Textual Representation, Up: Top
+
+MULE Character Sets and Encodings
+*********************************
+
+Recall that there are two primary ways that text is represented in
+XEmacs. The "buffer" representation sees the text as a series of bytes
+(Bufbytes), with a variable number of bytes used per character. The
+"character" representation sees the text as a series of integers
+(Emchars), one per character. The character representation is a cleaner
+representation from a theoretical standpoint, and is thus used in many
+cases when lots of manipulations on a string need to be done. However,
+the buffer representation is the standard representation used in both
+Lisp strings and buffers, and because of this, it is the "default"
+representation that text comes in. The reason for using this
+representation is that it's compact and is compatible with ASCII.
- (1 a 2.5)
+* Menu:
- converts to a three-element list containing the specified objects
-(note that a list is actually a set of nested conses; see the XEmacs
-Lisp Reference).
+* Character Sets::
+* Encodings::
+* Internal Mule Encodings::
+* CCL::
- [1 a 2.5]
+\1f
+File: internals.info, Node: Character Sets, Next: Encodings, Up: MULE Character Sets and Encodings
+
+Character Sets
+==============
+
+A character set (or "charset") is an ordered set of characters. A
+particular character in a charset is indexed using one or more
+"position codes", which are non-negative integers. The number of
+position codes needed to identify a particular character in a charset is
+called the "dimension" of the charset. In XEmacs/Mule, all charsets
+have dimension 1 or 2, and the size of all charsets (except for a few
+special cases) is either 94, 96, 94 by 94, or 96 by 96. The range of
+position codes used to index characters from any of these types of
+character sets is as follows:
+
+     Charset type          Position code 1       Position code 2
+     ------------------------------------------------------------
+     94                    33 - 126              N/A
+     96                    32 - 127              N/A
+     94x94                 33 - 126              33 - 126
+     96x96                 32 - 127              32 - 127
+
+ Note that in the above cases position codes do not start at an
+expected value such as 0 or 1. The reason for this will become clear
+later.
+
+ For example, Latin-1 is a 96-character charset, and JISX0208 (the
+Japanese national character set) is a 94x94-character charset.
+
+ [Note that, although the ranges above define the _valid_ position
+codes for a charset, some of the slots in a particular charset may in
+fact be empty. This is the case for JISX0208, for example, where (e.g.)
+all the slots whose first position code is in the range 117 - 126 are
+empty.]
+
+ There are three charsets that do not follow the above rules. All of
+them have one dimension, and have ranges of position codes as follows:
+
+     Charset name         Position code 1
+     ------------------------------------
+     ASCII                0 - 127
+     Control-1            0 - 31
+     Composite            0 - some large number
+
+ (The upper bound of the position code for composite characters has
+not yet been determined, but it will probably be at least 16,383).
+
+ ASCII is the union of two subsidiary character sets: Printing-ASCII
+(the printing ASCII character set, consisting of position codes 33 -
+126, as for a standard 94-character charset) and Control-ASCII (the
+non-printing characters that would appear in a binary file with codes 0
+- 32 and 127).
+
+ Control-1 contains the non-printing characters that would appear in a
+binary file with codes 128 - 159.
+
+ Composite contains characters that are generated by overstriking one
+or more characters from other charsets.
+
+ Note that some characters in ASCII, and all characters in Control-1,
+are "control" (non-printing) characters. These have no printed
+representation but instead control some other function of the printing
+(e.g. TAB or 9 moves the current character position to the next tab
+stop). All other characters in all charsets are "graphic" (printing)
+characters.
+
+ When a binary file is read in, the bytes in the file are assigned to
+character sets as follows:
+
+     Bytes             Character set     Range
+     --------------------------------------------------
+     0 - 127           ASCII             0 - 127
+     128 - 159         Control-1         0 - 31
+     160 - 255         Latin-1           32 - 127
+
+ This is a bit ad-hoc but gets the job done.
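+
+ As an illustration only, the table above can be sketched in C as
+follows. The type and function names (`binary_charset', `charset_pos',
+`classify_binary_byte') are hypothetical and do not come from the
+XEmacs sources.
+

```c
/* Hypothetical names, for illustration only.  */
enum binary_charset { CS_ASCII, CS_CONTROL_1, CS_LATIN_1 };

struct charset_pos
{
  enum binary_charset charset;
  int position_code;
};

/* Map a byte read from a binary file to a charset and a position
   code, following the three rows of the table.  */
struct charset_pos
classify_binary_byte (unsigned char byte)
{
  struct charset_pos result;

  if (byte < 128)
    {
      result.charset = CS_ASCII;
      result.position_code = byte;         /* 0 - 127 */
    }
  else if (byte < 160)
    {
      result.charset = CS_CONTROL_1;
      result.position_code = byte - 128;   /* 0 - 31 */
    }
  else
    {
      result.charset = CS_LATIN_1;
      result.position_code = byte - 128;   /* 32 - 127 */
    }
  return result;
}
```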
- converts to a three-element vector containing the specified objects.
+\1f
+File: internals.info, Node: Encodings, Next: Internal Mule Encodings, Prev: Character Sets, Up: MULE Character Sets and Encodings
- #[... ... ... ...]
+Encodings
+=========
- converts to a compiled-function object (the actual contents are not
-shown since they are not relevant here; look at a file that ends with
-`.elc' for examples).
+An "encoding" is a way of numerically representing characters from one
+or more character sets. If an encoding only encompasses one character
+set, then the position codes for the characters in that character set
+could be used directly. This is not possible, however, if more than
+one character set is to be used in the encoding.
- #*01110110
+ For example, the conversion detailed above between bytes in a binary
+file and characters is effectively an encoding that encompasses the
+three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit
+bytes.
- converts to a bit-vector.
+ Thus, an encoding can be viewed as a way of encoding characters from
+a specified group of character sets using a stream of bytes, each of
+which contains a fixed number of bits (but not necessarily 8, as in the
+common usage of "byte").
- #s(hash-table ... ...)
+ Here are descriptions of a couple of common encodings:
- converts to a hash table (the actual contents are not shown).
+* Menu:
- #s(range-table ... ...)
+* Japanese EUC (Extended Unix Code)::
+* JIS7::
- converts to a range table (the actual contents are not shown).
+\1f
+File: internals.info, Node: Japanese EUC (Extended Unix Code), Next: JIS7, Up: Encodings
- #s(char-table ... ...)
+Japanese EUC (Extended Unix Code)
+---------------------------------
- converts to a char table (the actual contents are not shown).
+This encompasses the character sets Printing-ASCII, Japanese-JISX0208,
+and Japanese-JISX0201-Kana (half-width katakana, the right half of
+JISX0201). It uses 8-bit bytes.
- Note that the `#s()' syntax is the general syntax for structures,
-which are not really implemented in XEmacs Lisp but should be.
+ Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
+charsets, while Japanese-JISX0208 is a 94x94-character charset.
- When an object is printed out (using `print' or a related function),
-the read syntax is used, so that the same object can be read in again.
+ The encoding is as follows:
- The other objects do not have read syntaxes, usually because it does
-not really make sense to create them in this fashion (i.e. processes,
-where it doesn't make sense to have a subprocess created as a side
-effect of reading some Lisp code), or because they can't be created at
-all (e.g. subrs). Permanent objects, as a rule, do not have a read
-syntax; nor do most complex objects, which contain too much state to be
-easily initialized through a read syntax.
+     Character set            Representation (PC=position-code)
+     -------------            --------------
+     Printing-ASCII           PC1
+     Japanese-JISX0201-Kana   0x8E | PC1 + 0x80
+     Japanese-JISX0208        PC1 + 0x80 | PC2 + 0x80
+     Japanese-JISX0212        0x8F | PC1 + 0x80 | PC2 + 0x80
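+
+ A minimal sketch of this encoding in C, covering only
+Printing-ASCII, Japanese-JISX0201-Kana and Japanese-JISX0208. The
+enum and the function `euc_encode' are hypothetical names chosen for
+this example; they are not part of XEmacs.
+

```c
#include <stddef.h>

/* Hypothetical charset identifiers, for illustration only.  */
enum euc_charset
{
  EUC_PRINTING_ASCII,
  EUC_JISX0201_KANA,
  EUC_JISX0208
};

/* Write the Japanese EUC bytes for one character into OUT, which must
   have room for at least 2 bytes, and return the number of bytes
   written.  PC2 is ignored for the 94-character charsets.  */
size_t
euc_encode (enum euc_charset cs, int pc1, int pc2, unsigned char *out)
{
  switch (cs)
    {
    case EUC_PRINTING_ASCII:
      out[0] = (unsigned char) pc1;
      return 1;
    case EUC_JISX0201_KANA:
      out[0] = 0x8E;
      out[1] = (unsigned char) (pc1 + 0x80);
      return 2;
    case EUC_JISX0208:
      out[0] = (unsigned char) (pc1 + 0x80);
      out[1] = (unsigned char) (pc2 + 0x80);
      return 2;
    }
  return 0;  /* unknown charset identifier */
}
```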
\1f
-File: internals.info, Node: How Lisp Objects Are Represented in C, Next: Rules When Writing New C Code, Prev: The XEmacs Object System (Abstractly Speaking), Up: Top
+File: internals.info, Node: JIS7, Prev: Japanese EUC (Extended Unix Code), Up: Encodings
+
+JIS7
+----
-How Lisp Objects Are Represented in C
-*************************************
+This encompasses the character sets Printing-ASCII,
+Japanese-JISX0201-Roman (the left half of JISX0201; this character set
+is very similar to Printing-ASCII and is a 94-character charset),
+Japanese-JISX0208, and Japanese-JISX0201-Kana. It uses 7-bit bytes.
- Lisp objects are represented in C using a 32-bit or 64-bit machine
-word (depending on the processor; i.e. DEC Alphas use 64-bit Lisp
-objects and most other processors use 32-bit Lisp objects). The
-representation stuffs a pointer together with a tag, as follows:
+ Unlike Japanese EUC, this is a "modal" encoding, which means that
+there are multiple states that the encoding can be in, which affect how
+the bytes are to be interpreted. Special sequences of bytes (called
+"escape sequences") are used to change states.
- [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
- [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ]
+ The encoding is as follows:
+
+     Character set              Representation (PC=position-code)
+     -------------              --------------
+     Printing-ASCII             PC1
+     Japanese-JISX0201-Roman    PC1
+     Japanese-JISX0201-Kana     PC1
+     Japanese-JISX0208          PC1 PC2
+
- <---------------------------------------------------------> <->
- a pointer to a structure, or an integer tag
-
- A tag of 00 is used for all pointer object types, a tag of 10 is used
-for characters, and the other two tags 01 and 11 are joined together to
-form the integer object type. This representation gives us 31 bit
-integers and 30 bit characters, while pointers are represented directly
-without any bit masking or shifting. This representation, though,
-assumes that pointers to structs are always aligned to multiples of 4,
-so the lower 2 bits are always zero.
-
- Lisp objects use the typedef `Lisp_Object', but the actual C type
-used for the Lisp object can vary. It can be either a simple type
-(`long' on the DEC Alpha, `int' on other machines) or a structure whose
-fields are bit fields that line up properly (actually, a union of
-structures is used). Generally the simple integral type is preferable
-because it ensures that the compiler will actually use a machine word
-to represent the object (some compilers will use more general and less
-efficient code for unions and structs even if they can fit in a machine
-word). The union type, however, has the advantage of stricter type
-checking. If you accidentally pass an integer where a Lisp object is
-desired, you get a compile error. The choice of which type to use is
-determined by the preprocessor constant `USE_UNION_TYPE' which is
-defined via the `--use-union-type' option to `configure'.
-
- Various macros are used to convert between Lisp_Objects and the
-corresponding C type. Macros of the form `XINT()', `XCHAR()',
-`XSTRING()', `XSYMBOL()', do any required bit shifting and/or masking
-and cast it to the appropriate type. `XINT()' needs to be a bit tricky
-so that negative numbers are properly sign-extended. Since integers
-are stored left-shifted, if the right-shift operator does an arithmetic
-shift (i.e. it leaves the most-significant bit as-is rather than
-shifting in a zero, so that it mimics a divide-by-two even for negative
-numbers) the shift to remove the tag bit is enough. This is the case
-on all the systems we support.
-
- Note that when `ERROR_CHECK_TYPECHECK' is defined, the converter
-macros become more complicated--they check the tag bits and/or the type
-field in the first four bytes of a record type to ensure that the
-object is really of the correct type. This is great for catching places
-where an incorrect type is being dereferenced--this typically results
-in a pointer being dereferenced as the wrong type of structure, with
-unpredictable (and sometimes not easily traceable) results.
-
- There are similar `XSETTYPE()' macros that construct a Lisp object.
-These macros are of the form `XSETTYPE (LVALUE, RESULT)', i.e. they
-have to be a statement rather than just used in an expression. The
-reason for this is that standard C doesn't let you "construct" a
-structure (but GCC does). Granted, this sometimes isn't too
-convenient; for the case of integers, at least, you can use the
-function `make_int()', which constructs and _returns_ an integer Lisp
-object. Note that the `XSETTYPE()' macros are also affected by
-`ERROR_CHECK_TYPECHECK' and make sure that the structure is of the
-right type in the case of record types, where the type is contained in
-the structure.
-
- The C programmer is responsible for *guaranteeing* that a
-Lisp_Object is the correct type before using the `XTYPE' macros. This
-is especially important in the case of lists. Use `XCAR' and `XCDR' if
-a Lisp_Object is certainly a cons cell, else use `Fcar()' and `Fcdr()'.
-Trust other C code, but not Lisp code. On the other hand, if XEmacs
-has an internal logic error, it's better to crash immediately, so
-sprinkle `assert()'s and "unreachable" `abort()'s liberally about the
-source code. Where performance is an issue, use `type_checking_assert',
-`bufpos_checking_assert', and `gc_checking_assert', which do nothing
-unless the corresponding configure error checking flag was specified.
+     Escape sequence   ASCII equivalent   Meaning
+     ---------------   ----------------   -------
+     0x1B 0x28 0x4A    ESC ( J            invoke Japanese-JISX0201-Roman
+     0x1B 0x28 0x49    ESC ( I            invoke Japanese-JISX0201-Kana
+     0x1B 0x24 0x42    ESC $ B            invoke Japanese-JISX0208
+     0x1B 0x28 0x42    ESC ( B            invoke Printing-ASCII
+
+ Initially, Printing-ASCII is invoked.
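+
+ To show what "modal" means in practice, here is a rough sketch of
+a scanner that tracks the invoked character set while counting the
+characters in a JIS7 byte stream. The names `jis7_state' and
+`jis7_count_chars' are hypothetical, and the sketch makes simplifying
+assumptions: it treats every non-escape byte as part of a character
+and stops at the first escape sequence that is not in the table above.
+

```c
#include <stddef.h>

/* Hypothetical state names, one per charset in the table.  */
enum jis7_state { JIS7_ASCII, JIS7_ROMAN, JIS7_KANA, JIS7_JISX0208 };

/* Count the characters in the LEN bytes at P, and store the charset
   that is invoked at the end of the scan in *FINAL_STATE.  */
size_t
jis7_count_chars (const unsigned char *p, size_t len,
                  enum jis7_state *final_state)
{
  enum jis7_state state = JIS7_ASCII;  /* the initial state */
  size_t count = 0;
  size_t i = 0;

  while (i < len)
    {
      if (p[i] == 0x1B)
        {
          if (i + 2 >= len)
            break;                     /* truncated escape sequence */
          if (p[i + 1] == 0x28 && p[i + 2] == 0x4A)
            state = JIS7_ROMAN;
          else if (p[i + 1] == 0x28 && p[i + 2] == 0x49)
            state = JIS7_KANA;
          else if (p[i + 1] == 0x24 && p[i + 2] == 0x42)
            state = JIS7_JISX0208;
          else if (p[i + 1] == 0x28 && p[i + 2] == 0x42)
            state = JIS7_ASCII;
          else
            break;                     /* sequence not in the table */
          i += 3;
        }
      else
        {
          /* Only Japanese-JISX0208 uses two position codes.  */
          size_t width = (state == JIS7_JISX0208) ? 2 : 1;
          if (i + width > len)
            break;                     /* truncated character */
          count++;
          i += width;
        }
    }
  *final_state = state;
  return count;
}
```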
\1f
-File: internals.info, Node: Rules When Writing New C Code, Next: A Summary of the Various XEmacs Modules, Prev: How Lisp Objects Are Represented in C, Up: Top
+File: internals.info, Node: Internal Mule Encodings, Next: CCL, Prev: Encodings, Up: MULE Character Sets and Encodings
-Rules When Writing New C Code
-*****************************
+Internal Mule Encodings
+=======================
- The XEmacs C Code is extremely complex and intricate, and there are
-many rules that are more or less consistently followed throughout the
-code. Many of these rules are not obvious, so they are explained here.
-It is of the utmost importance that you follow them. If you don't,
-you may get something that appears to work, but which will crash in odd
-situations, often in code far away from where the actual breakage is.
+In XEmacs/Mule, each character set is assigned a unique number, called a
+"leading byte". This is used in the encodings of a character. Leading
+bytes are in the range 0x80 - 0xFF (except for ASCII, which has a
+leading byte of 0), although some leading bytes are reserved.
+
+ Charsets whose leading byte is in the range 0x80 - 0x9F are called
+"official" and are used for built-in charsets. Other charsets are
+called "private" and have leading bytes in the range 0xA0 - 0xFF; these
+are user-defined charsets.
+
+ More specifically:
+
+     Character set                Leading byte
+     -------------                ------------
+     ASCII                        0
+     Composite                    0x80
+     Dimension-1 Official         0x81 - 0x8D
+                                    (0x8E is free)
+     Control-1                    0x8F
+     Dimension-2 Official         0x90 - 0x99
+                                    (0x9A - 0x9D are free;
+                                     0x9E and 0x9F are reserved)
+     Dimension-1 Private          0xA0 - 0xEF
+     Dimension-2 Private          0xF0 - 0xFF
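+
+ The assignment above could be expressed as a small classification
+function. This is a sketch only; `lb_class' and
+`classify_leading_byte' are hypothetical names, not identifiers from
+the XEmacs sources.
+

```c
/* Hypothetical classification of leading bytes.  */
enum lb_class
{
  LB_ASCII,
  LB_COMPOSITE,
  LB_CONTROL_1,
  LB_DIM1_OFFICIAL,
  LB_DIM2_OFFICIAL,
  LB_DIM1_PRIVATE,
  LB_DIM2_PRIVATE,
  LB_UNASSIGNED
};

/* Classify the leading byte LB according to the table.  */
enum lb_class
classify_leading_byte (unsigned char lb)
{
  if (lb == 0x00)
    return LB_ASCII;
  if (lb == 0x80)
    return LB_COMPOSITE;
  if (lb >= 0x81 && lb <= 0x8D)
    return LB_DIM1_OFFICIAL;
  if (lb == 0x8F)
    return LB_CONTROL_1;
  if (lb >= 0x90 && lb <= 0x99)
    return LB_DIM2_OFFICIAL;
  if (lb >= 0xA0 && lb <= 0xEF)
    return LB_DIM1_PRIVATE;
  if (lb >= 0xF0)
    return LB_DIM2_PRIVATE;
  /* 0x01 - 0x7F are not leading bytes; 0x8E and 0x9A - 0x9D are
     free; 0x9E and 0x9F are reserved.  */
  return LB_UNASSIGNED;
}
```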
+
+ There are two internal encodings for characters in XEmacs/Mule. One
+is called "string encoding" and is an 8-bit encoding that is used for
+representing characters in a buffer or string. It uses 1 to 4 bytes per
+character. The other is called "character encoding" and is a 19-bit
+encoding that is used for representing characters individually in a
+variable.
+
+ (In the following descriptions, we'll ignore composite characters for
+the moment. We also give a general (structural) overview first,
+followed later by the exact details.)
* Menu:
-* General Coding Rules::
-* Writing Lisp Primitives::
-* Writing Good Comments::
-* Adding Global Lisp Variables::
-* Proper Use of Unsigned Types::
-* Coding for Mule::
-* Techniques for XEmacs Developers::
+* Internal String Encoding::
+* Internal Character Encoding::
\1f
-File: internals.info, Node: General Coding Rules, Next: Writing Lisp Primitives, Up: Rules When Writing New C Code
+File: internals.info, Node: Internal String Encoding, Next: Internal Character Encoding, Up: Internal Mule Encodings
+
+Internal String Encoding
+------------------------
+
+ASCII characters are encoded using their position code directly. Other
+characters are encoded using their leading byte followed by their
+position code(s) with the high bit set. Characters in private character
+sets have their leading byte prefixed with a "leading byte prefix",
+which is either 0x9E or 0x9F. (No character sets are ever assigned these
+leading bytes.) Specifically:
+
+     Character set        Encoding (PC=position-code, LB=leading-byte)
+     -------------        --------
+     ASCII                PC1  |
+     Control-1            LB   | PC1 + 0xA0 |
+     Dimension-1 official LB   | PC1 + 0x80 |
+     Dimension-1 private  0x9E | LB         | PC1 + 0x80 |
+     Dimension-2 official LB   | PC1 + 0x80 | PC2 + 0x80 |
+     Dimension-2 private  0x9F | LB         | PC1 + 0x80 | PC2 + 0x80
+
+ The basic characteristic of this encoding is that the first byte of
+every character is in the range 0x00 - 0x9F, and the second and
+following bytes of every character are in the range 0xA0 - 0xFF. This
+means that it is impossible to get out of sync, or more specifically:
+
+ 1. Given any byte position, the beginning of the character it is
+ within can be determined in constant time.
+
+ 2. Given any byte position at the beginning of a character, the
+ beginning of the next character can be determined in constant time.
+
+ 3. Given any byte position at the beginning of a character, the
+ beginning of the previous character can be determined in constant
+ time.
+
+ 4. Textual searches can simply treat encoded strings as if they were
+ encoded in a one-byte-per-character fashion rather than the actual
+ multi-byte encoding.
+
+ None of the standard non-modal encodings meet all of these
+conditions. For example, EUC satisfies only (2) and (3), while
+Shift-JIS and Big5 (not yet described) satisfy only (2). (All non-modal
+encodings must satisfy (2), in order to be unambiguous.)
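+
+ Properties (1) and (2) follow directly from the byte ranges. The
+following sketch assumes a well-formed string in the internal string
+encoding; the function names `char_start' and `next_char_start' are
+hypothetical. Each loop runs at most three times, because a
+character occupies at most 4 bytes.
+

```c
#include <stddef.h>

/* Return the index of the first byte of the character that contains
   byte position POS.  Every non-first byte is >= 0xA0, so we step
   backwards until we see a byte below that value.  */
size_t
char_start (const unsigned char *text, size_t pos)
{
  while (pos > 0 && text[pos] >= 0xA0)
    pos--;
  return pos;
}

/* POS must be the index of the first byte of a character and must be
   less than LEN.  Return the index of the first byte of the next
   character, or LEN if there is none.  */
size_t
next_char_start (const unsigned char *text, size_t len, size_t pos)
{
  pos++;
  while (pos < len && text[pos] >= 0xA0)
    pos++;
  return pos;
}
```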
-General Coding Rules
-====================
-
- The C code is actually written in a dialect of C called "Clean C",
-meaning that it can be compiled, mostly warning-free, with either a C or
-C++ compiler. Coding in Clean C has several advantages over plain C.
-C++ compilers are more nit-picking, and a number of coding errors have
-been found by compiling with C++. The ability to use both C and C++
-tools means that a greater variety of development tools are available to
-the developer.
-
- Every module includes `<config.h>' (angle brackets so that
-`--srcdir' works correctly; `config.h' may or may not be in the same
-directory as the C sources) and `lisp.h'. `config.h' must always be
-included before any other header files (including system header files)
-to ensure that certain tricks played by various `s/' and `m/' files
-work out correctly.
-
- When including header files, always use angle brackets, not double
-quotes, except when the file to be included is always in the same
-directory as the including file. If either file is a generated file,
-then that is not likely to be the case. In order to understand why we
-have this rule, imagine what happens when you do a build in the source
-directory using `./configure' and another build in another directory
-using `../work/configure'. There will be two different `config.h'
-files. Which one will be used if you `#include "config.h"'?
-
- Almost every module contains a `syms_of_*()' function and a
-`vars_of_*()' function. The former declares any Lisp primitives you
-have defined and defines any symbols you will be using. The latter
-declares any global Lisp variables you have added and initializes global
-C variables in the module. *Important*: There are stringent
-requirements on exactly what can go into these functions. See the
-comment in `emacs.c'. The reason for this is to avoid obscure unwanted
-interactions during initialization. If you don't follow these rules,
-you'll be sorry! If you want to do anything that isn't allowed, create
-a `complex_vars_of_*()' function for it. Doing this is tricky, though:
-you have to make sure your function is called at the right time so that
-all the initialization dependencies work out.
-
- Declare each function of these kinds in `symsinit.h'. Make sure
-it's called in the appropriate place in `emacs.c'. You never need to
-include `symsinit.h' directly, because it is included by `lisp.h'.
-
- *All global and static variables that are to be modifiable must be
-declared uninitialized.* This means that you may not use the "declare
-with initializer" form for these variables, such as `int some_variable
-= 0;'. The reason for this has to do with some kludges done during the
-dumping process: If possible, the initialized data segment is re-mapped
-so that it becomes part of the (unmodifiable) code segment in the
-dumped executable. This allows this memory to be shared among multiple
-running XEmacs processes. XEmacs is careful to place as much constant
-data as possible into initialized variables during the `temacs' phase.
-
- *Please note:* This kludge only works on a few systems nowadays, and
-is rapidly becoming irrelevant because most modern operating systems
-provide "copy-on-write" semantics. All data is initially shared
-between processes, and a private copy is automatically made (on a
-page-by-page basis) when a process first attempts to write to a page of
-memory.
-
- Formerly, there was a requirement that static variables not be
-declared inside of functions. This had to do with another hack along
-the same vein as what was just described: old USG systems put
-statically-declared variables in the initialized data space, so those
-header files had a `#define static' declaration. (That way, the
-data-segment remapping described above could still work.) This fails
-badly on static variables inside of functions, which suddenly become
-automatic variables; therefore, you weren't supposed to have any of
-them. This awful kludge has been removed in XEmacs because
-
- 1. almost all of the systems that used this kludge ended up having to
- disable the data-segment remapping anyway;
-
- 2. the only systems that didn't were extremely outdated ones;
-
- 3. this hack completely messed up inline functions.
-
- The C source code makes heavy use of C preprocessor macros. One
-popular macro style is:
-
- #define FOO(var, value) do { \
- Lisp_Object FOO_value = (value); \
- ... /* compute using FOO_value */ \
- (var) = bar; \
- } while (0)
-
- The `do {...} while (0)' is a standard trick to allow FOO to have
-statement semantics, so that it can safely be used within an `if'
-statement in C, for example. Multiple evaluation is prevented by
-copying a supplied argument into a local variable, so that
-`FOO(var,fun(1))' only calls `fun' once.
-
- Lisp lists are popular data structures in the C code as well as in
-Elisp. There are two sets of macros that iterate over lists.
-`EXTERNAL_LIST_LOOP_N' should be used when the list has been supplied
-by the user, and cannot be trusted to be acyclic and `nil'-terminated.
-A `malformed-list' or `circular-list' error will be generated if the
-list being iterated over is not entirely kosher. `LIST_LOOP_N', on the
-other hand, is faster and less safe, and can be used only on trusted
-lists.
-
- Related macros are `GET_EXTERNAL_LIST_LENGTH' and `GET_LIST_LENGTH',
-which calculate the length of a list, and in the case of
-`GET_EXTERNAL_LIST_LENGTH', validating the properness of the list. The
-macros `EXTERNAL_LIST_LOOP_DELETE_IF' and `LIST_LOOP_DELETE_IF' delete
-elements from a lisp list satisfying some predicate.
+\1f
+File: internals.info, Node: Internal Character Encoding, Prev: Internal String Encoding, Up: Internal Mule Encodings
+
+Internal Character Encoding
+---------------------------
+
+One 19-bit word represents a single character. The word is separated
+into three fields:
+
+     Bit number:   18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
+                   <------------> <------------------> <------------------>
+     Field:              1                 2                    3
+
+ Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5
+bits.
+
+     Character set         Field 1         Field 2         Field 3
+     -------------         -------         -------         -------
+     ASCII                 0               0               PC1
+        range:                                             (00 - 7F)
+     Control-1             0               1               PC1
+        range:                                             (00 - 1F)
+     Dimension-1 official  0               LB - 0x80       PC1
+        range:                             (01 - 0D)       (20 - 7F)
+     Dimension-1 private   0               LB - 0x80       PC1
+        range:                             (20 - 6F)       (20 - 7F)
+     Dimension-2 official  LB - 0x8F       PC1             PC2
+        range:             (01 - 0A)       (20 - 7F)       (20 - 7F)
+     Dimension-2 private   LB - 0xE1       PC1             PC2
+        range:             (0F - 1E)       (20 - 7F)       (20 - 7F)
+     Composite             0x1F            ?               ?
+
+ Note that character codes 0 - 255 are the same as the "binary
+encoding" described above.
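+
+ Since field 1 occupies bits 18 - 14, field 2 bits 13 - 7 and field
+3 bits 6 - 0, assembling a character code amounts to two shifts. The
+helper names below are hypothetical and serve only to illustrate the
+layout; composite characters are not handled.
+

```c
/* Assemble a 19-bit character code from its three fields.  */
unsigned int
make_char_code (unsigned int field1, unsigned int field2,
                unsigned int field3)
{
  return (field1 << 14) | (field2 << 7) | field3;
}

/* Character in a dimension-1 charset (official or private) with
   leading byte LB and position code PC1.  */
unsigned int
dim1_char_code (unsigned int lb, unsigned int pc1)
{
  return make_char_code (0, lb - 0x80, pc1);
}

/* Character in a dimension-2 official charset with leading byte LB
   and position codes PC1 and PC2.  */
unsigned int
dim2_official_char_code (unsigned int lb, unsigned int pc1,
                         unsigned int pc2)
{
  return make_char_code (lb - 0x8F, pc1, pc2);
}
```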
\1f
-File: internals.info, Node: Writing Lisp Primitives, Next: Writing Good Comments, Prev: General Coding Rules, Up: Rules When Writing New C Code
+File: internals.info, Node: CCL, Prev: Internal Mule Encodings, Up: MULE Character Sets and Encodings
-Writing Lisp Primitives
-=======================
+CCL
+===
- Lisp primitives are Lisp functions implemented in C. The details of
-interfacing the C function so that Lisp can call it are handled by a few
-C macros. The only way to really understand how to write new C code is
-to read the source, but we can explain some things here.
-
- An example of a special form is the definition of `prog1', from
-`eval.c'. (An ordinary function would have the same general
-appearance.)
-
- DEFUN ("prog1", Fprog1, 1, UNEVALLED, 0, /*
- Similar to `progn', but the value of the first form is returned.
- \(prog1 FIRST BODY...): All the arguments are evaluated sequentially.
- The value of FIRST is saved during evaluation of the remaining args,
- whose values are discarded.
- */
- (args))
- {
- /* This function can GC */
- REGISTER Lisp_Object val, form, tail;
- struct gcpro gcpro1;
+ CCL PROGRAM SYNTAX:
+ CCL_PROGRAM := (CCL_MAIN_BLOCK
+ [ CCL_EOF_BLOCK ])
- val = Feval (XCAR (args));
+ CCL_MAIN_BLOCK := CCL_BLOCK
+ CCL_EOF_BLOCK := CCL_BLOCK
- GCPRO1 (val);
+ CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
+ STATEMENT :=
+ SET | IF | BRANCH | LOOP | REPEAT | BREAK
+ | READ | WRITE | END
- LIST_LOOP_3 (form, XCDR (args), tail)
- Feval (form);
+ SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
+ | INT-OR-CHAR
- UNGCPRO;
- return val;
- }
-
- Let's start with a precise explanation of the arguments to the
-`DEFUN' macro. Here is a template for them:
-
- DEFUN (LNAME, FNAME, MIN_ARGS, MAX_ARGS, INTERACTIVE, /*
- DOCSTRING
- */
- (ARGLIST))
-
-LNAME
- This string is the name of the Lisp symbol to define as the
- function name; in the example above, it is `"prog1"'.
-
-FNAME
- This is the C function name for this function. This is the name
- that is used in C code for calling the function. The name is, by
- convention, `F' prepended to the Lisp name, with all dashes (`-')
- in the Lisp name changed to underscores. Thus, to call this
- function from C code, call `Fprog1'. Remember that the arguments
- are of type `Lisp_Object'; various macros and functions for
- creating values of type `Lisp_Object' are declared in the file
- `lisp.h'.
-
- Primitives whose names are special characters (e.g. `+' or `<')
- are named by spelling out, in some fashion, the special character:
- e.g. `Fplus()' or `Flss()'. Primitives whose names begin with
- normal alphanumeric characters but also contain special characters
- are spelled out in some creative way, e.g. `let*' becomes
- `FletX()'.
-
- Each function also has an associated structure that holds the data
- for the subr object that represents the function in Lisp. This
- structure conveys the Lisp symbol name to the initialization
- routine that will create the symbol and store the subr object as
- its definition. The C variable name of this structure is always
- `S' prepended to the FNAME. You hardly ever need to be aware of
- the existence of this structure, since `DEFUN' plus `DEFSUBR'
- takes care of all the details.
-
-MIN_ARGS
- This is the minimum number of arguments that the function
- requires. The function `prog1' allows a minimum of one argument.
-
-MAX_ARGS
- This is the maximum number of arguments that the function accepts,
- if there is a fixed maximum. Alternatively, it can be `UNEVALLED',
- indicating a special form that receives unevaluated arguments, or
- `MANY', indicating an unlimited number of evaluated arguments (the
- C equivalent of `&rest'). Both `UNEVALLED' and `MANY' are macros.
- If MAX_ARGS is a number, it may not be less than MIN_ARGS and it
- may not be greater than 8. (If you need to add a function with
- more than 8 arguments, use the `MANY' form. Resist the urge to
- edit the definition of `DEFUN' in `lisp.h'. If you do it anyways,
- make sure to also add another clause to the switch statement in
- `primitive_funcall().')
-
-INTERACTIVE
- This is an interactive specification, a string such as might be
- used as the argument of `interactive' in a Lisp function. In the
- case of `prog1', it is 0 (a null pointer), indicating that `prog1'
- cannot be called interactively. A value of `""' indicates a
- function that should receive no arguments when called
- interactively.
-
-DOCSTRING
- This is the documentation string. It is written just like a
- documentation string for a function defined in Lisp; in
- particular, the first line should be a single sentence. Note how
- the documentation string is enclosed in a comment, none of the
- documentation is placed on the same lines as the comment-start and
- comment-end characters, and the comment-start characters are on
- the same line as the interactive specification. `make-docfile',
- which scans the C files for documentation strings, is very
- particular about what it looks for, and will not properly extract
- the doc string if it's not in this exact format.
-
- In order to make both `etags' and `make-docfile' happy, make sure
- that the `DEFUN' line contains the LNAME and FNAME, and that the
- comment-start characters for the doc string are on the same line
- as the interactive specification, and put a newline directly after
- them (and before the comment-end characters).
-
-ARGLIST
- This is the comma-separated list of arguments to the C function.
- For a function with a fixed maximum number of arguments, provide a
- C argument for each Lisp argument. In this case, unlike regular C
- functions, the types of the arguments are not declared; they are
- simply always of type `Lisp_Object'.
-
- The names of the C arguments will be used as the names of the
- arguments to the Lisp primitive as displayed in its documentation,
- modulo the same concerns described above for `F...' names (in
- particular, underscores in the C arguments become dashes in the
- Lisp arguments).
-
- There is one additional kludge: A trailing `_' on the C argument is
- discarded when forming the Lisp argument. This allows C language
- reserved words (like `default') or global symbols (like `dirname')
- to be used as argument names without compiler warnings or errors.
-
- A Lisp function with MAX_ARGS = `UNEVALLED' is a "special form";
- its arguments are not evaluated. Instead it receives one argument
- of type `Lisp_Object', a (Lisp) list of the unevaluated arguments,
- conventionally named `(args)'.
-
- When a Lisp function has no upper limit on the number of arguments,
- specify MAX_ARGS = `MANY'. In this case its implementation in C
- actually receives exactly two arguments: the number of Lisp
- arguments (an `int') and the address of a block containing their
- values (a `Lisp_Object *'). In this case only are the C types
- specified in the ARGLIST: `(int nargs, Lisp_Object *args)'.
-
- Within the function `Fprog1' itself, note the use of the macros
-`GCPRO1' and `UNGCPRO'. `GCPRO1' is used to "protect" a variable from
-garbage collection--to inform the garbage collector that it must look
-in that variable and regard the object pointed at by its contents as an
-accessible object. This is necessary whenever you call `Feval' or
-anything that can directly or indirectly call `Feval' (this includes
-the `QUIT' macro!). At such a time, any Lisp object that you intend to
-refer to again must be protected somehow. `UNGCPRO' cancels the
-protection of the variables that are protected in the current function.
-It is necessary to do this explicitly.
-
- The macro `GCPRO1' protects just one local variable. If you want to
-protect two, use `GCPRO2' instead; repeating `GCPRO1' will not work.
-Macros `GCPRO3' and `GCPRO4' also exist.
-
- These macros implicitly use local variables such as `gcpro1'; you
-must declare these explicitly, with type `struct gcpro'. Thus, if you
-use `GCPRO2', you must declare `gcpro1' and `gcpro2'.
-
- Note also that the general rule is "caller-protects"; i.e. you are
-only responsible for protecting those Lisp objects that you create. Any
-objects passed to you as arguments should have been protected by whoever
-created them, so you don't in general have to protect them.
-
- In particular, the arguments to any Lisp primitive are always
-automatically `GCPRO'ed, when called "normally" from Lisp code or
-bytecode. So only a few Lisp primitives that are called frequently from
-C code, such as `Fprogn' protect their arguments as a service to their
-caller. You don't need to protect your arguments when writing a new
-`DEFUN'.
-
- `GCPRO'ing is perhaps the trickiest and most error-prone part of
-XEmacs coding. It is *extremely* important that you get this right and
-use a great deal of discipline when writing this code. *Note
-`GCPRO'ing: GCPROing, for full details on how to do this.
-
- What `DEFUN' actually does is declare a global structure of type
-`Lisp_Subr' whose name begins with capital `SF' and which contains
-information about the primitive (e.g. a pointer to the function, its
-minimum and maximum allowed arguments, a string describing its Lisp
-name); `DEFUN' then begins a normal C function declaration using the
-`F...' name. The Lisp subr object that is the function definition of a
-primitive (i.e. the object in the function slot of the symbol that
-names the primitive) actually points to this `SF' structure; when
-`Feval' encounters a subr, it looks in the structure to find out how to
-call the C function.
-
- Defining the C function is not enough to make a Lisp primitive
-available; you must also create the Lisp symbol for the primitive (the
-symbol is "interned"; *note Obarrays::) and store a suitable subr
-object in its function cell. (If you don't do this, the primitive won't
-be seen by Lisp code.) The code looks like this:
-
- DEFSUBR (FNAME);
-
-Here FNAME is the same name you used as the second argument to `DEFUN'.
-
- This call to `DEFSUBR' should go in the `syms_of_*()' function at
-the end of the module. If no such function exists, create it and make
-sure to also declare it in `symsinit.h' and call it from the
-appropriate spot in `main()'. *Note General Coding Rules::.
-
- Note that C code cannot call functions by name unless they are
-defined in C. The way to call a function written in Lisp from C is to
-use `Ffuncall', which embodies the Lisp function `funcall'. Since the
-Lisp function `funcall' accepts an unlimited number of arguments, in C
-it takes two: the number of Lisp-level arguments, and a one-dimensional
-array containing their values. The first Lisp-level argument is the
-Lisp function to call, and the rest are the arguments to pass to it.
-Since `Ffuncall' can call the evaluator, you must protect pointers from
-garbage collection around the call to `Ffuncall'. (However, `Ffuncall'
-explicitly protects all of its parameters, so you don't have to protect
-any pointers passed as parameters to it.)
-
- The C functions `call0', `call1', `call2', and so on, provide handy
-ways to call a Lisp function conveniently with a fixed number of
-arguments. They work by calling `Ffuncall'.
-
- `eval.c' is a very good file to look through for examples; `lisp.h'
-contains the definitions for important macros and functions.
+ EXPRESSION := ARG | (EXPRESSION OP ARG)
+
+ IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
+ BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
+ LOOP := (loop STATEMENT [STATEMENT ...])
+ BREAK := (break)
+ REPEAT := (repeat)
+ | (write-repeat [REG | INT-OR-CHAR | string])
+ | (write-read-repeat REG [INT-OR-CHAR | string | ARRAY]?)
+ READ := (read REG) | (read REG REG)
+ | (read-if REG OP ARG CCL_BLOCK CCL_BLOCK)
+ | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
+ WRITE := (write REG) | (write REG REG)
+ | (write INT-OR-CHAR) | (write STRING) | STRING
+ | (write REG ARRAY)
+ END := (end)
+
+ REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
+ ARG := REG | INT-OR-CHAR
+ OP := + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
+ | < | > | == | <= | >= | !=
+ SELF_OP :=
+ += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
+ ARRAY := '[' INT-OR-CHAR ... ']'
+ INT-OR-CHAR := INT | CHAR
+
+ MACHINE CODE:
+
+ The machine code consists of a vector of 32-bit words.
+ The first such word specifies the start of the EOF section of the code;
+ this is the code executed to handle any cleanup that needs to be done
+ (e.g. designating back to ASCII and left-to-right mode) after all
+ other encoded/decoded data has been written out. This is not used for
+ charset CCL programs.
+
+ REGISTER: 0..7 -- referred to by RRR or rrr
+
+ OPERATOR BIT FIELD (23-bit): XXXXXXXXXXXXXXX RRR TTTTT
+      TTTTT (5-bit): operator type
+      RRR (3-bit): register number
+      XXXXXXXXXXXXXXX (15-bit):
+           CCCCCCCCCCCCCCC: constant or address
+           000000000000rrr: register number
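+
+ The bit-field layout above can be sketched in C as follows. This
+assumes that the string `XXXXXXXXXXXXXXX RRR TTTTT' is written most
+significant bit first, so that TTTTT occupies the low five bits of the
+word. The names `struct ccl_fields', `ccl_encode_word' and
+`ccl_decode_word' are invented for the example and do not come from the
+XEmacs sources.
+
```c
#include <assert.h>
#include <stdint.h>

/* The three fields of one operator word (hypothetical type). */
struct ccl_fields
{
  unsigned int op;   /* TTTTT: 5-bit operator type */
  unsigned int reg;  /* RRR: 3-bit register number */
  unsigned int arg;  /* X...X: 15-bit constant, address or register */
};

/* Build an operator word, assuming TTTTT sits in the lowest bits. */
static uint32_t
ccl_encode_word (unsigned int op, unsigned int reg, unsigned int arg)
{
  assert (op <= 0x1F);
  assert (reg <= 0x07);
  assert (arg <= 0x7FFF);
  return ((uint32_t) arg << 8) | ((uint32_t) reg << 5) | (uint32_t) op;
}

/* Split an operator word back into its three fields. */
static struct ccl_fields
ccl_decode_word (uint32_t word)
{
  struct ccl_fields f;
  f.op = word & 0x1F;
  f.reg = (word >> 5) & 0x07;
  f.arg = (word >> 8) & 0x7FFF;
  return f;
}
```
+
+ Under that assumption, a `SetCS' word (operator type 00000) that
+loads the constant 65 into r3 decodes to `op' 0, `reg' 3 and `arg' 65.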
+
+ AAAAA: 00000 +
+        00001 -
+        00010 *
+        00011 /
+        00100 %
+        00101 &
+        00110 |
+        00111 ^
+
+        01000 <<
+        01001 >>
+        01010 <8
+        01011 >8
+        01100 //
+        01101 not used
+        01110 not used
+        01111 not used
+
+        10000 <
+        10001 >
+        10010 ==
+        10011 <=
+        10100 >=
+        10101 !=
+
+ OPERATORS:      TTTTT RRR XX..
+
+ SetCS:          00000 RRR C...C  RRR = C...C
+ SetCL:          00001 RRR .....  RRR = c...c
+                 c.............c
+ SetR:           00010 RRR ..rrr  RRR = rrr
+ SetA:           00011 RRR ..rrr  RRR = array[rrr]
+                 C.............C  size of array = C...C
+                 c.............c  contents = c...c
+
+ Jump:           00100 000 c...c  jump to c...c
+ JumpCond:       00101 RRR c...c  if (!RRR) jump to c...c
+ WriteJump:      00110 RRR c...c  Write1 RRR, jump to c...c
+ WriteReadJump:  00111 RRR c...c  Write1, Read1 RRR, jump to c...c
+ WriteCJump:     01000 000 c...c  Write1 C...C, jump to c...c
+                 C...C
+ WriteCReadJump: 01001 RRR c...c  Write1 C...C, Read1 RRR,
+                 C.............C  and jump to c...c
+ WriteSJump:     01010 000 c...c  WriteS, jump to c...c
+                 C.............C
+                 S.............S
+                 ...
+ WriteSReadJump: 01011 RRR c...c  WriteS, Read1 RRR, jump to c...c
+                 C.............C
+                 S.............S
+                 ...
+ WriteAReadJump: 01100 RRR c...c  WriteA, Read1 RRR, jump to c...c
+                 C.............C  size of array = C...C
+                 c.............c  contents = c...c
+                 ...
+ Branch:         01101 RRR C...C  if (RRR >= 0 && RRR < C..)
+                 c.............c  branch to (RRR+1)th address
+ Read1:          01110 RRR .....  read 1-byte to RRR
+ Read2:          01111 RRR ..rrr  read 2-byte to RRR and rrr
+ ReadBranch:     10000 RRR C...C  Read1 and Branch
+                 c.............c
+                 ...
+ Write1:         10001 RRR .....  write 1-byte RRR
+ Write2:         10010 RRR ..rrr  write 2-byte RRR and rrr
+ WriteC:         10011 000 .....  write 1-char C...C
+                 C.............C
+ WriteS:         10100 000 .....  write C..-byte of string
+                 C.............C
+                 S.............S
+                 ...
+ WriteA:         10101 RRR .....  write array[RRR]
+                 C.............C  size of array = C...C
+                 c.............c  contents = c...c
+                 ...
+ End:            10110 000 .....  terminate the execution
+
+ SetSelfCS:      10111 RRR C...C  RRR AAAAA= C...C
+                 ..........AAAAA
+ SetSelfCL:      11000 RRR .....  RRR AAAAA= c...c
+                 c.............c
+                 ..........AAAAA
+ SetSelfR:       11001 RRR ..rrr  RRR AAAAA= rrr
+                 ..........AAAAA
+ SetExprCL:      11010 RRR ..rrr  RRR = rrr AAAAA c...c
+                 c.............c
+                 ..........AAAAA
+ SetExprR:       11011 RRR ..rrr  RRR = rrr AAAAA Rrr
+                 ............Rrr
+                 ..........AAAAA
+ JumpCondC:      11100 RRR c...c  if !(RRR AAAAA C..) jump to c...c
+                 C.............C
+                 ..........AAAAA
+ JumpCondR:      11101 RRR c...c  if !(RRR AAAAA rrr) jump to c...c
+                 ............rrr
+                 ..........AAAAA
+ ReadJumpCondC:  11110 RRR c...c  Read1 and JumpCondC
+                 C.............C
+                 ..........AAAAA
+ ReadJumpCondR:  11111 RRR c...c  Read1 and JumpCondR
+                 ............rrr
+                 ..........AAAAA
\1f
-File: internals.info, Node: Writing Good Comments, Next: Adding Global Lisp Variables, Prev: Writing Lisp Primitives, Up: Rules When Writing New C Code
-
-Writing Good Comments
-=====================
-
- Comments are a lifeline for programmers trying to understand tricky
-code. In general, the less obvious it is what you are doing, the more
-you need a comment, and the more detailed it needs to be. You should
-always be on guard when you're writing code for stuff that's tricky, and
-should constantly be putting yourself in someone else's shoes and asking
-if that person could figure out without much difficulty what's going
-on. (Assume they are a competent programmer who understands the
-essentials of how the XEmacs code is structured but doesn't know much
-about the module you're working on or any algorithms you're using.) If
-you're not sure whether they would be able to, add a comment. Always
-err on the side of more comments, rather than less.
-
- Generally, when making comments, there is no need to attribute them
-with your name or initials. This especially goes for small,
-easy-to-understand, non-opinionated ones. Also, comments indicating
-where, when, and by whom a file was changed are _strongly_ discouraged,
-and in general will be removed as they are discovered. This is exactly
-what `ChangeLogs' are there for. However, it can occasionally be
-useful to mark exactly where (but not when or by whom) changes are
-made, particularly when making small changes to a file imported from
-elsewhere. These marks help when later on a newer version of the file
-is imported and the changes need to be merged. (If everything were
-always kept in CVS, there would be no need for this. But in practice,
-this often doesn't happen, or the CVS repository is later on lost or
-unavailable to the person doing the update.)
-
- When putting in an explicit opinion in a comment, you should
-_always_ attribute it with your name, and optionally the date. This
-also goes for long, complex comments explaining in detail the workings
-of something - by putting your name there, you make it possible for
-someone who has questions about how that thing works to determine who
-wrote the comment so they can write to them. Preferably, use your
-actual name and not your initials, unless your initials are generally
-recognized (e.g. `jwz'). You can use only your first name if it's
-obvious who you are; otherwise, give first and last name. If you're
-not a regular contributor, you might consider putting your email
-address in - it may be in the ChangeLog, but after awhile ChangeLogs
-have a tendency of disappearing or getting muddled. (E.g. your comment
-may get copied somewhere else or even into another program, and
-tracking down the proper ChangeLog may be very difficult.)
-
- If you come across an opinion that is not or no longer valid, or you
-come across any comment that no longer applies but you want to keep it
-around, enclose it in `[[ ' and ` ]]' marks and add a comment
-afterwards explaining why the preceding comment is no longer valid. Put
-your name on this comment, as explained above.
-
- Just as comments are a lifeline to programmers, incorrect comments
-are death. If you come across an incorrect comment, *immediately*
-correct it or flag it as incorrect, as described in the previous
-paragraph. Whenever you work on a section of code, _always_ make sure
-to update any comments to be correct - or, at the very least, flag them
-as incorrect.
-
- To indicate a "todo" or other problem, use four pound signs - i.e.
-`####'.
+File: internals.info, Node: The Lisp Reader and Compiler, Next: Lstreams, Prev: MULE Character Sets and Encodings, Up: Top
+
+The Lisp Reader and Compiler
+****************************
+
+Not yet documented.
\1f
-File: internals.info, Node: Adding Global Lisp Variables, Next: Proper Use of Unsigned Types, Prev: Writing Good Comments, Up: Rules When Writing New C Code
-
-Adding Global Lisp Variables
-============================
-
- Global variables whose names begin with `Q' are constants whose
-value is a symbol of a particular name. The name of the variable should
-be derived from the name of the symbol using the same rules as for Lisp
-primitives. These variables are initialized using a call to
-`defsymbol()' in the `syms_of_*()' function. (This call interns a
-symbol, sets the C variable to the resulting Lisp object, and calls
-`staticpro()' on the C variable to tell the garbage-collection
-mechanism about this variable. What `staticpro()' does is add a
-pointer to the variable to a large global array; when
-garbage-collection happens, all pointers listed in the array are used
-as starting points for marking Lisp objects. This is important because
-it's quite possible that the only current reference to the object is
-the C variable. In the case of symbols, the `staticpro()' doesn't
-matter all that much because the symbol is contained in `obarray',
-which is itself `staticpro()'ed. However, it's possible that a naughty
-user could do something like uninterning the symbol out of `obarray' or
-even setting `obarray' to a different value [although this is likely to
-make XEmacs crash!].)
-
- *Please note:* It is potentially deadly if you declare a `Q...'
-variable in two different modules. The two calls to `defsymbol()' are
-no problem, but some linkers will complain about multiply-defined
-symbols. The most insidious aspect of this is that often the link will
-succeed anyway, but then the resulting executable will sometimes crash
-in obscure ways during certain operations! To avoid this problem,
-declare any symbols with common names (such as `text') that are not
-obviously associated with this particular module in the module
-`general.c'.
-
- Global variables whose names begin with `V' are variables that
-contain Lisp objects. The convention here is that all global variables
-of type `Lisp_Object' begin with `V', and all others don't (including
-integer and boolean variables that have Lisp equivalents). Most of the
-time, these variables have equivalents in Lisp, but some don't. Those
-that do are declared this way by a call to `DEFVAR_LISP()' in the
-`vars_of_*()' initializer for the module. What this does is create a
-special "symbol-value-forward" Lisp object that contains a pointer to
-the C variable, intern a symbol whose name is as specified in the call
-to `DEFVAR_LISP()', and set its value to the symbol-value-forward Lisp
-object; it also calls `staticpro()' on the C variable to tell the
-garbage-collection mechanism about the variable. When `eval' (or
-actually `symbol-value') encounters this special object in the process
-of retrieving a variable's value, it follows the indirection to the C
-variable and gets its value. `setq' does similar things so that the C
-variable gets changed.
-
- Whether or not you `DEFVAR_LISP()' a variable, you need to
-initialize it in the `vars_of_*()' function; otherwise it will end up
-as all zeroes, which is the integer 0 (_not_ `nil'), and this is
-probably not what you want. Also, if the variable is not
-`DEFVAR_LISP()'ed, *you must call* `staticpro()' on the C variable in
-the `vars_of_*()' function. Otherwise, the garbage-collection
-mechanism won't know that the object in this variable is in use, and
-will happily collect it and reuse its storage for another Lisp object,
-and you will be the one who's unhappy when you can't figure out how
-your variable got overwritten.
+File: internals.info, Node: Lstreams, Next: Consoles; Devices; Frames; Windows, Prev: The Lisp Reader and Compiler, Up: Top
+
+Lstreams
+********
+
+An "lstream" is an internal Lisp object that provides a generic
+buffering stream implementation. Conceptually, you send data to the
+stream or read data from the stream, not caring what's on the other end
+of the stream. The other end could be another stream, a file
+descriptor, a stdio stream, a fixed block of memory, a reallocating
+block of memory, etc. The main purpose of the stream is to provide a
+standard interface and to do buffering. Macros are defined to read or
+write characters, so the calling functions do not have to worry about
+blocking data together in order to achieve efficiency.
+
+* Menu:
+
+* Creating an Lstream:: Creating an lstream object.
+* Lstream Types:: Different sorts of things that are streamed.
+* Lstream Functions:: Functions for working with lstreams.
+* Lstream Methods:: Creating new lstream types.
+
+\1f
+File: internals.info, Node: Creating an Lstream, Next: Lstream Types, Up: Lstreams
+
+Creating an Lstream
+===================
+
+Lstreams come in different types, depending on what is being interfaced
+to. Although the primitive for creating new lstreams is
+`Lstream_new()', generally you do not call this directly. Instead, you
+call some type-specific creation function, which creates the lstream
+and initializes it as appropriate for the particular type.
+
+ All lstream creation functions take a MODE argument, specifying the
+mode in which the lstream should be opened. This controls whether the
+lstream is for input or output, and optionally whether data should be
+blocked up in units of MULE characters. Note that some types of
+lstreams can only be opened for input; others only for output; and
+others can be opened either way. #### Richard Mlynarik thinks that
+there should be a strict separation between input and output streams,
+and he's probably right.
+
+ MODE is a string, one of
+
+`"r"'
+ Open for reading.
+
+`"w"'
+ Open for writing.
+
+`"rc"'
+ Open for reading, but "read" never returns partial MULE characters.
+
+`"wc"'
+ Open for writing, but never writes partial MULE characters.
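+
+ The four mode strings above could be mapped onto flags along the
+following lines. This is only a sketch: the flag names and the
+function `toy_parse_mode' are hypothetical and are not part of
+`lstream.h'.
+
```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag names, used only in this example. */
enum toy_lstream_mode
{
  TOY_MODE_READ = 1 << 0,   /* stream is open for input */
  TOY_MODE_WRITE = 1 << 1,  /* stream is open for output */
  TOY_MODE_CHAR = 1 << 2    /* never split a MULE character */
};

/* Translate a MODE string into flags.  Returns -1 for any string
   other than the four documented ones. */
static int
toy_parse_mode (const char *mode)
{
  assert (mode != NULL);
  if (strcmp (mode, "r") == 0)
    return TOY_MODE_READ;
  if (strcmp (mode, "w") == 0)
    return TOY_MODE_WRITE;
  if (strcmp (mode, "rc") == 0)
    return TOY_MODE_READ | TOY_MODE_CHAR;
  if (strcmp (mode, "wc") == 0)
    return TOY_MODE_WRITE | TOY_MODE_CHAR;
  return -1;
}
```
+
+ Rejecting every other string, including `"rw"', keeps the sketch in
+line with the list above, which has no combined read/write mode.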
\1f
-File: internals.info, Node: Proper Use of Unsigned Types, Next: Coding for Mule, Prev: Adding Global Lisp Variables, Up: Rules When Writing New C Code
+File: internals.info, Node: Lstream Types, Next: Lstream Functions, Prev: Creating an Lstream, Up: Lstreams
+
+Lstream Types
+=============
+
+stdio
+
+filedesc
-Proper Use of Unsigned Types
-============================
+lisp-string
- Avoid using `unsigned int' and `unsigned long' whenever possible.
-Unsigned types are viral - any arithmetic or comparisons involving
-mixed signed and unsigned types are automatically converted to
-unsigned, which is almost certainly not what you want. Many subtle and
-hard-to-find bugs are created by careless use of unsigned types. In
-general, you should almost _never_ use an unsigned type to hold a
-regular quantity of any sort. The only exceptions are
+fixed-buffer
- 1. When there's a reasonable possibility you will actually need all
- 32 or 64 bits to store the quantity.
+resizing-buffer
- 2. When calling existing API's that require unsigned types. In this
- case, you should still do all manipulation using signed types, and
- do the conversion at the very threshold of the API call.
+dynarr
- 3. In existing code that you don't want to modify because you don't
- maintain it.
+lisp-buffer
- 4. In bit-field structures.
+print
- Other reasonable uses of `unsigned int' and `unsigned long' are
-representing non-quantities - e.g. bit-oriented flags and such.
+decoding
+
+encoding
+
+\1f
+File: internals.info, Node: Lstream Functions, Next: Lstream Methods, Prev: Lstream Types, Up: Lstreams
+
+Lstream Functions
+=================
+
+ - Function: Lstream * Lstream_new (Lstream_implementation *IMP, const
+ char *MODE)
+ Allocate and return a new Lstream. This function is not really
+ meant to be called directly; rather, each stream type should
+ provide its own stream creation function, which creates the stream
+ and does any other necessary creation stuff (e.g. opening a file).
+
+ - Function: void Lstream_set_buffering (Lstream *LSTR,
+ Lstream_buffering BUFFERING, int BUFFERING_SIZE)
+ Change the buffering of a stream. See `lstream.h'. By default the
+ buffering is `LSTREAM_BLOCK_BUFFERED'.
+
+ - Function: int Lstream_flush (Lstream *LSTR)
+ Flush out any pending unwritten data in the stream. Clear any
+ buffered input data. Returns 0 on success, -1 on error.
+
+ - Macro: int Lstream_putc (Lstream *STREAM, int C)
+ Write out one byte to the stream. This is a macro and so it is
+ very efficient. The C argument is only evaluated once but the
+ STREAM argument is evaluated more than once. Returns 0 on
+ success, -1 on error.
+
+ - Macro: int Lstream_getc (Lstream *STREAM)
+ Read one byte from the stream. This is a macro and so it is very
+ efficient. The STREAM argument is evaluated more than once.
+ Return value is -1 for EOF or error.
+
+ - Macro: void Lstream_ungetc (Lstream *STREAM, int C)
+ Push one byte back onto the input queue. This will be the next
+ byte read from the stream. Any number of bytes can be pushed back
+ and will be read in the reverse order they were pushed back--most
+ recent first. (This is necessary for consistency--if there are a
+ number of bytes that have been unread and I read and unread a
+ byte, it needs to be the first to be read again.) This is a macro
+ and so it is very efficient. The C argument is only evaluated
+ once but the STREAM argument is evaluated more than once.
+
+ - Function: int Lstream_fputc (Lstream *STREAM, int C)
+ - Function: int Lstream_fgetc (Lstream *STREAM)
+ - Function: void Lstream_fungetc (Lstream *STREAM, int C)
+ Function equivalents of the above macros.
+
+ - Function: ssize_t Lstream_read (Lstream *STREAM, void *DATA, size_t
+ SIZE)
+ Read SIZE bytes of DATA from the stream. Return the number of
+ bytes read. 0 means EOF. -1 means an error occurred and no bytes
+ were read.
+
+ - Function: ssize_t Lstream_write (Lstream *STREAM, void *DATA, size_t
+ SIZE)
+ Write SIZE bytes of DATA to the stream. Return the number of
+ bytes written. -1 means an error occurred and no bytes were
+ written.
+
+ - Function: void Lstream_unread (Lstream *STREAM, void *DATA, size_t
+ SIZE)
+ Push back SIZE bytes of DATA onto the input queue. The next call
+ to `Lstream_read()' with the same size will read the same bytes
+ back. Note that this will be the case even if there is other
+ pending unread data.
+
+ - Function: int Lstream_close (Lstream *STREAM)
+ Close the stream. All data will be flushed out.
+
+ - Function: void Lstream_reopen (Lstream *STREAM)
+ Reopen a closed stream. This enables I/O on it again. This is not
+ meant to be called except from a wrapper routine that reinitializes
+ variables and such--the close routine may well have freed some
+ necessary storage structures, for example.
+
+ - Function: void Lstream_rewind (Lstream *STREAM)
+ Rewind the stream to the beginning.
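+
+ The push-back ordering described for `Lstream_ungetc()' and
+`Lstream_unread()' can be modelled with a small stand-alone toy
+stream. Nothing below uses the real lstream implementation: `struct
+toy_stream' and the `toy_*' functions are invented for illustration,
+and unlike the real `Lstream_ungetc()' the toy version returns a status
+because its push-back area has a fixed size.
+
```c
#include <assert.h>
#include <stddef.h>

#define TOY_UNGET_MAX 16

/* A read-only stream over a fixed block of memory with a push-back
   stack.  All names are hypothetical. */
struct toy_stream
{
  const unsigned char *data;
  size_t len;
  size_t pos;
  unsigned char unget[TOY_UNGET_MAX];
  size_t unget_count;
};

static void
toy_init (struct toy_stream *s, const unsigned char *data, size_t len)
{
  assert (s != NULL);
  s->data = data;
  s->len = len;
  s->pos = 0;
  s->unget_count = 0;
}

/* Read one byte: pushed-back bytes first (most recent first), then
   the underlying data.  Returns -1 at EOF. */
static int
toy_getc (struct toy_stream *s)
{
  if (s->unget_count > 0)
    return s->unget[--s->unget_count];
  if (s->pos < s->len)
    return s->data[s->pos++];
  return -1;
}

/* Push one byte back; it becomes the next byte read. */
static int
toy_ungetc (struct toy_stream *s, int c)
{
  if (s->unget_count == TOY_UNGET_MAX)
    return -1;
  s->unget[s->unget_count++] = (unsigned char) c;
  return 0;
}

/* Push back a block so that it is read again in its original order:
   the last byte is pushed first, the first byte last. */
static int
toy_unread (struct toy_stream *s, const unsigned char *p, size_t n)
{
  if (n > TOY_UNGET_MAX - s->unget_count)
    return -1;
  while (n > 0)
    s->unget[s->unget_count++] = p[--n];
  return 0;
}
```
+
+ Pushing `x' and then `y' makes the next two reads return `y' and
+then `x', while unreading the block `"hi"' makes the next two reads
+return `h' and then `i', even when other unread data is pending.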
\1f
-File: internals.info, Node: Coding for Mule, Next: Techniques for XEmacs Developers, Prev: Proper Use of Unsigned Types, Up: Rules When Writing New C Code
+File: internals.info, Node: Lstream Methods, Prev: Lstream Functions, Up: Lstreams
-Coding for Mule
+Lstream Methods
===============
- Although Mule support is not compiled by default in XEmacs, many
-people are using it, and we consider it crucial that new code works
-correctly with multibyte characters. This is not hard; it is only a
-matter of following several simple user-interface guidelines. Even if
-you never compile with Mule, with a little practice you will find it
-quite easy to code Mule-correctly.
+ - Lstream Method: ssize_t reader (Lstream *STREAM, unsigned char
+ *DATA, size_t SIZE)
+ Read some data from the stream's end and store it into DATA, which
+ can hold SIZE bytes. Return the number of bytes read. A return
+ value of 0 means no bytes can be read at this time. This may be
+ because of an EOF, or because there is a granularity greater than
+ one byte that the stream imposes on the returned data, and SIZE is
+ less than this granularity. (This will happen frequently for
+ streams that need to return whole characters, because
+ `Lstream_read()' calls the reader function repeatedly until it has
+ the number of bytes it wants or until 0 is returned.) The lstream
+ functions do not treat a 0 return as EOF or do anything special;
+ however, the calling function will interpret any 0 it gets back as
+ EOF. This will normally not happen unless the caller calls
+ `Lstream_read()' with a very small size.
+
+ This function can be `NULL' if the stream is output-only.
+
+ - Lstream Method: ssize_t writer (Lstream *STREAM, const unsigned char
+ *DATA, size_t SIZE)
+ Send some data to the stream's end. Data to be sent is in DATA
+ and is SIZE bytes. Return the number of bytes sent. This
+ function can send and return fewer bytes than are passed in; in that
+ case, the function will just be called again until there is no
+ data left or 0 is returned. A return value of 0 means that no
+ more data can be currently stored, but there is no error; the data
+ will be squirreled away until the writer can accept data. (This is
+ useful, e.g., if you're dealing with a non-blocking file
+ descriptor and are getting `EWOULDBLOCK' errors.) This function
+ can be `NULL' if the stream is input-only.
+
+ - Lstream Method: int rewinder (Lstream *STREAM)
+ Rewind the stream. If this is `NULL', the stream is not seekable.
+
+ - Lstream Method: int seekable_p (Lstream *STREAM)
+ Indicate whether this stream is seekable--i.e. it can be rewound.
+ This method is ignored if the stream does not have a rewind
+ method. If this method is not present, the result is determined
+ by whether a rewind method is present.
+
+ - Lstream Method: int flusher (Lstream *STREAM)
+ Perform any additional operations necessary to flush the data in
+ this stream.
+
+ - Lstream Method: int pseudo_closer (Lstream *STREAM)
+
+ - Lstream Method: int closer (Lstream *STREAM)
+ Perform any additional operations necessary to close this stream
+ down. May be `NULL'. This function is called when
+ `Lstream_close()' is called or when the stream is
+ garbage-collected. When this function is called, all pending data
+ in the stream will already have been written out.
+
+ - Lstream Method: Lisp_Object marker (Lisp_Object LSTREAM, void
+ (*MARKFUN) (Lisp_Object))
+ Mark this object for garbage collection. Same semantics as a
+ standard `Lisp_Object' marker. This function can be `NULL'.
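+
+ The retry behaviour described for the writer method can be sketched
+as follows. This is only an illustration, not the actual lstream code:
+`write_until_stalled', `writer_fn', `demo_sink' and `demo_writer' are
+hypothetical names, and treating a negative return as an error is an
+assumption.
+
```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical stand-in for a stream's writer method. */
typedef ssize_t (*writer_fn) (void *stream, const unsigned char *data,
                              size_t size);

/* Call WRITER until all SIZE bytes are accepted or it returns 0
   ("cannot store more right now").  Returns the number of bytes
   accepted, or -1 if the writer returns a negative value (assumed
   here to mean an error).  */
ssize_t
write_until_stalled (writer_fn writer, void *stream,
                     const unsigned char *data, size_t size)
{
  size_t sent = 0;

  while (sent < size)
    {
      ssize_t n = writer (stream, data + sent, size - sent);
      if (n < 0)
        return -1;
      if (n == 0)
        break;                  /* caller keeps the unsent tail buffered */
      sent += (size_t) n;
    }
  return (ssize_t) sent;
}

/* Demo sink: accepts at most 3 bytes per call until ROOM runs out. */
struct demo_sink { size_t room; size_t calls; };

ssize_t
demo_writer (void *stream, const unsigned char *data, size_t size)
{
  struct demo_sink *sink = stream;
  size_t n = size < 3 ? size : 3;

  (void) data;                  /* the demo discards the bytes */
  if (n > sink->room)
    n = sink->room;
  sink->room -= n;
  sink->calls++;
  return (ssize_t) n;
}
```
+
+ With a sink that has room for 8 bytes, passing 10 bytes returns 8; the
+caller is expected to hold on to the remaining 2 bytes until the writer
+can accept data again.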
+
+\1f
+File: internals.info, Node: Consoles; Devices; Frames; Windows, Next: The Redisplay Mechanism, Prev: Lstreams, Up: Top
+
+Consoles; Devices; Frames; Windows
+**********************************
+
+* Menu:
+
+* Introduction to Consoles; Devices; Frames; Windows::
+* Point::
+* Window Hierarchy::
+* The Window Object::
+
+\1f
+File: internals.info, Node: Introduction to Consoles; Devices; Frames; Windows, Next: Point, Up: Consoles; Devices; Frames; Windows
+
+Introduction to Consoles; Devices; Frames; Windows
+==================================================
+
+A window-system window that you see on the screen is called a "frame"
+in Emacs terminology. Each frame is subdivided into one or more
+non-overlapping panes, called (confusingly) "windows". Each window
+displays the text of a buffer in it. (See above on Buffers.) Note that
+buffers and windows are independent entities: Two or more windows can
+be displaying the same buffer (potentially in different locations), and
+a buffer can be displayed in no windows.
+
+ A single display screen that contains one or more frames is called a
+"display". Under most circumstances, there is only one display.
+However, more than one display can exist, for example if you have a
+"multi-headed" console, i.e. one with a single keyboard but multiple
+displays. (Typically in such a situation, the various displays act like
+one large display, in that the mouse is only in one of them at a time,
+and moving the mouse off of one moves it into another.) In some cases,
+the different displays will have different characteristics, e.g. one
+color and one mono.
+
+ XEmacs can display frames on multiple displays. It can even deal
+simultaneously with frames on multiple keyboards (called "consoles" in
+XEmacs terminology). Here is one case where this might be useful: You
+are using XEmacs on your workstation at work, and leave it running.
+Then you go home and dial in on a TTY line, and you can use the
+already-running XEmacs process to display another frame on your local
+TTY.
+
+ Thus, there is a hierarchy console -> display -> frame -> window.
+There is a separate Lisp object type for each of these four concepts.
+Furthermore, there is logically a "selected console", "selected
+display", "selected frame", and "selected window". Each of these
+objects is distinguished in various ways, such as being the default
+object for various functions that act on objects of that type. Note
+that every containing object remembers the "selected" object among the
+objects that it contains: e.g. not only is there a selected window, but
+every frame remembers the last window in it that was selected, and
+changing the selected frame causes the remembered window within it to
+become the selected window. Similar relationships apply for consoles
+to displays and displays to frames. (The Lisp object that represents a
+display is called a "device".)
+
+\1f
+File: internals.info, Node: Point, Next: Window Hierarchy, Prev: Introduction to Consoles; Devices; Frames; Windows, Up: Consoles; Devices; Frames; Windows
+
+Point
+=====
+
+Recall that every buffer has a current insertion position, called
+"point". Now, two or more windows may be displaying the same buffer,
+and the text cursor in the two windows (i.e. `point') can be in two
+different places. You may ask, how can that be, since each buffer has
+only one value of `point'? The answer is that each window also has a
+value of `point' that is squirreled away in it. There is only one
+selected window, and the value of "point" in that buffer corresponds to
+that window. When the selected window is changed from one window to
+another displaying the same buffer, the old value of `point' is stored
+into the old window's "point" and the value of `point' from the new
+window is retrieved and made the value of `point' in the buffer. This
+means that `window-point' for the selected window is potentially
+inaccurate, and if you want to retrieve the correct value of `point'
+for a window, you must special-case on the selected window and retrieve
+the buffer's point instead. This is related to why
+`save-window-excursion' does not save the selected window's value of
+`point'.
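+
+ The special case for the selected window can be sketched like this.
+The structures are hypothetical and much simplified: a real window
+stores `pointm' as a marker rather than an integer, and `window_point'
+and `select_window' here are illustrative names, not the XEmacs
+functions.
+
```c
#include <stddef.h>

struct buffer { long point; };
struct window { struct buffer *buffer; long pointm; };

/* The one selected window; NULL until something is selected. */
struct window *selected_window;

/* Accurate value of point for W: the buffer's point for the selected
   window, the stashed value for every other window.  */
long
window_point (const struct window *w)
{
  return w == selected_window ? w->buffer->point : w->pointm;
}

/* Stash the buffer's point in the old selected window, then load the
   new window's stashed point into its buffer.  */
void
select_window (struct window *w)
{
  if (selected_window != NULL)
    selected_window->pointm = selected_window->buffer->point;
  selected_window = w;
  w->buffer->point = w->pointm;
}
```
+
+ While a window is selected, its `pointm' field may lag behind the
+buffer's point, which is why the accessor checks for the selected
+window first.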
+
+\1f
+File: internals.info, Node: Window Hierarchy, Next: The Window Object, Prev: Point, Up: Consoles; Devices; Frames; Windows
+
+Window Hierarchy
+================
+
+If a frame contains multiple windows (panes), they are always created
+by splitting an existing window along the horizontal or vertical axis.
+Terminology is a bit confusing here: to "split a window horizontally"
+means to create two side-by-side windows, i.e. to make a _vertical_ cut
+in a window. Likewise, to "split a window vertically" means to create
+two windows, one above the other, by making a _horizontal_ cut.
+
+ If you split a window and then split again along the same axis, you
+will end up with a number of panes all arranged along the same axis.
+The precise way in which the splits were made should not be important,
+and this is reflected internally. Internally, all windows are arranged
+in a tree, consisting of two types of windows, "combination" windows
+(which have children, and are covered completely by those children) and
+"leaf" windows, which have no children and are visible. Every
+combination window has two or more children, all arranged along the same
+axis. There are (logically) two subtypes of windows, depending on
+whether their children are horizontally or vertically arrayed. There is
+always one root window, which is either a leaf window (if the frame
+contains only one window) or a combination window (if the frame contains
+more than one window). In the latter case, the root window will have
+two or more children, either horizontally or vertically arrayed, and
+each of those children will be either a leaf window or another
+combination window.
+
+ Here are some rules:
+
+ 1. Horizontal combination windows can never have children that are
+ horizontal combination windows; same for vertical.
+
+ 2. Only leaf windows can be split (obviously) and this splitting does
+ one of two things: (a) turns the leaf window into a combination
+ window and creates two new leaf children, or (b) turns the leaf
+ window into one of the two new leaves and creates the other leaf.
+ Rule (1) dictates which of these two outcomes happens.
+
+ 3. Every combination window must have at least two children.
+
+ 4. Leaf windows can never become combination windows. They can be
+ deleted, however. If this results in a violation of (3), the
+ parent combination window also gets deleted.
+
+ 5. All functions that accept windows must be prepared to accept
+ combination windows, and do something sane (e.g. signal an error
+ if so). Combination windows _do_ escape to the Lisp level.
+
+ 6. All windows have three fields governing their contents: these are
+ "hchild" (a list of horizontally-arrayed children), "vchild" (a
+ list of vertically-arrayed children), and "buffer" (the buffer
+ contained in a leaf window). Exactly one of these will be
+ non-`nil'. Remember that "horizontally-arrayed" means
+ "side-by-side" and "vertically-arrayed" means "one above the
+ other".
+
+ 7. Leaf windows also have markers in their `start' (the first buffer
+ position displayed in the window) and `pointm' (the window's
+ stashed value of `point'--see above) fields, while combination
+ windows have `nil' in these fields.
+
+ 8. The list of children for a window is threaded through the `next'
+ and `prev' fields of each child window.
+
+ 9. *Deleted windows can be undeleted*. This happens as a result of
+ restoring a window configuration, and is unlike frames, displays,
+ and consoles, which, once deleted, can never be restored.
+ Deleting a window does nothing except set a special `dead' bit to
+ 1 and clear out the `next', `prev', `hchild', and `vchild' fields,
+ for GC purposes.
+
+ 10. Most frames actually have two top-level windows--one for the
+ minibuffer and one (the "root") for everything else. The modeline
+ (if present) separates these two. The `next' field of the root
+ points to the minibuffer, and the `prev' field of the minibuffer
+ points to the root. The other `next' and `prev' fields are `nil',
+ and the frame points to both of these windows. Minibuffer-less
+ frames have no minibuffer window, and the `next' and `prev' of the
+ root window are `nil'. Minibuffer-only frames have no root
+ window, and the `next' of the minibuffer window is `nil' but the
+ `prev' points to itself. (#### This is an artifact that should be
+ fixed.)
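+
+ Rules (1) through (3), (6) and (8) can be illustrated with a small
+sketch of splitting a leaf. It takes the reading in which a newly
+created combination window takes over the leaf's slot in the tree, so
+that the leaf object itself stays a leaf; that reading, the
+`tree_window' structure, and the function names are all assumptions
+made for illustration.
+
```c
#include <stdlib.h>

enum axis { HORIZONTAL, VERTICAL };

struct tree_window
{
  struct tree_window *parent, *next, *prev;
  struct tree_window *hchild;   /* first side-by-side child, or NULL */
  struct tree_window *vchild;   /* first stacked child, or NULL */
  int buffer;                   /* nonzero stands in for a buffer (leaf) */
};

struct tree_window *
new_window (int buffer)
{
  struct tree_window *w = calloc (1, sizeof *w);
  if (w == NULL)
    abort ();
  w->buffer = buffer;
  return w;
}

/* Split LEAF along AXIS and return the new sibling leaf.  If LEAF was
   the root, the caller must treat LEAF->parent as the new root.  */
struct tree_window *
split_window (struct tree_window *leaf, enum axis axis)
{
  struct tree_window *p = leaf->parent;
  struct tree_window *fresh = new_window (leaf->buffer);
  int same_axis = p != NULL
    && (axis == HORIZONTAL ? p->hchild != NULL : p->vchild != NULL);

  if (!same_axis)
    {
      /* Rule (1): never nest same-axis combinations, so only here do
         we create a combination window in LEAF's place.  */
      struct tree_window *comb = new_window (0);
      comb->parent = p;
      comb->next = leaf->next;
      comb->prev = leaf->prev;
      if (comb->prev) comb->prev->next = comb;
      if (comb->next) comb->next->prev = comb;
      if (p && p->hchild == leaf) p->hchild = comb;
      if (p && p->vchild == leaf) p->vchild = comb;
      if (axis == HORIZONTAL) comb->hchild = leaf;
      else comb->vchild = leaf;
      leaf->parent = comb;
      leaf->next = leaf->prev = NULL;
    }

  /* Thread the new leaf into the sibling chain right after LEAF.  */
  fresh->parent = leaf->parent;
  fresh->prev = leaf;
  fresh->next = leaf->next;
  if (leaf->next) leaf->next->prev = fresh;
  leaf->next = fresh;
  return fresh;
}
```
+
+ Splitting twice along the same axis yields three siblings under one
+combination window, while splitting along the other axis introduces a
+nested combination window with exactly two children.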
+
+\1f
+File: internals.info, Node: The Window Object, Prev: Window Hierarchy, Up: Consoles; Devices; Frames; Windows
+
+The Window Object
+=================
+
+Windows have the following accessible fields:
+
+`frame'
+ The frame that this window is on.
+
+`mini_p'
+ Non-`nil' if this window is a minibuffer window.
+
+`buffer'
+ The buffer that the window is displaying. This may change often
+ during the life of the window.
+
+`dedicated'
+ Non-`nil' if this window is dedicated to its buffer.
+
+`pointm'
+ This is the value of point in the current buffer when this window
+ is selected; when it is not selected, it retains its previous
+ value.
+
+`start'
+ The position in the buffer that is the first character to be
+ displayed in the window.
+
+`force_start'
+ If this flag is non-`nil', it says that the window has been
+ scrolled explicitly by the Lisp program. This affects what the
+ next redisplay does if point is off the screen: instead of
+ scrolling the window to show the text around point, it moves point
+ to a location that is on the screen.
+
+`last_modified'
+ The `modified' field of the window's buffer, as of the last time a
+ redisplay completed in this window.
+
+`last_point'
+ The buffer's value of point, as of the last time a redisplay
+ completed in this window.
+
+`left'
+ This is the left-hand edge of the window, measured in columns.
+ (The leftmost column on the screen is column 0.)
+
+`top'
+ This is the top edge of the window, measured in lines. (The top
+ line on the screen is line 0.)
+
+`height'
+ The height of the window, measured in lines.
+
+`width'
+ The width of the window, measured in columns.
+
+`next'
+ This is the window that is the next in the chain of siblings. It
+ is `nil' in a window that is the rightmost or bottommost of a
+ group of siblings.
+
+`prev'
+ This is the window that is the previous in the chain of siblings.
+ It is `nil' in a window that is the leftmost or topmost of a group
+ of siblings.
+
+`parent'
+ Internally, XEmacs arranges windows in a tree; each group of
+ siblings has a parent window whose area includes all the siblings.
+ This field points to a window's parent.
+
+ Parent windows do not display buffers, and play little role in
+ display except to shape their child windows. Emacs Lisp programs
+ usually have no access to the parent windows; they operate on the
+ windows at the leaves of the tree, which actually display buffers.
+
+`hscroll'
+ This is the number of columns that the display in the window is
+ scrolled horizontally to the left. Normally, this is 0.
+
+`use_time'
+ This is the last time that the window was selected. The function
+ `get-lru-window' uses this field.
+
+`display_table'
+ The window's display table, or `nil' if none is specified for it.
+
+`update_mode_line'
+ Non-`nil' means this window's mode line needs to be updated.
+
+`base_line_number'
+ The line number of a certain position in the buffer, or `nil'.
+ This is used for displaying the line number of point in the mode
+ line.
+
+`base_line_pos'
+ The position in the buffer for which the line number is known, or
+ `nil' meaning none is known.
+
+`region_showing'
+ If the region (or part of it) is highlighted in this window, this
+ field holds the mark position that made one end of that region.
+ Otherwise, this field is `nil'.
+
+\1f
+File: internals.info, Node: The Redisplay Mechanism, Next: Extents, Prev: Consoles; Devices; Frames; Windows, Up: Top
- Note that these guidelines are not necessarily tied to the current
-Mule implementation; they are also a good idea to follow on the grounds
-of code generalization for future I18N work.
+The Redisplay Mechanism
+***********************
+
+The redisplay mechanism is one of the most complicated sections of
+XEmacs, especially from a conceptual standpoint. This is doubly so
+because, unlike for the basic aspects of the Lisp interpreter, the
+computer science theories of how to efficiently handle redisplay are not
+well-developed.
+
+ When working with the redisplay mechanism, remember the Golden Rules
+of Redisplay:
+
+ 1. It Is Better To Be Correct Than Fast.
+
+ 2. Thou Shalt Not Run Elisp From Within Redisplay.
+
+ 3. It Is Better To Be Fast Than Not To Be.
* Menu:
-* Character-Related Data Types::
-* Working With Character and Byte Positions::
-* Conversion to and from External Data::
-* General Guidelines for Writing Mule-Aware Code::
-* An Example of Mule-Aware Code::
+* Critical Redisplay Sections::
+* Line Start Cache::
+* Redisplay Piece by Piece::
\1f
-File: internals.info, Node: Character-Related Data Types, Next: Working With Character and Byte Positions, Up: Coding for Mule
+File: internals.info, Node: Critical Redisplay Sections, Next: Line Start Cache, Up: The Redisplay Mechanism
+
+Critical Redisplay Sections
+===========================
+
+Within this section, we are defenseless and assume that the following
+cannot happen:
+
+ 1. garbage collection
+
+ 2. Lisp code evaluation
+
+ 3. frame size changes
+
+ We ensure (3) by calling `hold_frame_size_changes()', which will
+cause any pending frame size changes to get put on hold till after the
+end of the critical section. (1) follows automatically if (2) is met.
+#### Unfortunately, there are some places where Lisp code can be called
+within this section. We need to remove them.
+
+ If `Fsignal()' is called during this critical section, we will
+`abort()'.
+
+ If garbage collection is called during this critical section, we
+simply return. #### We should abort instead.
+
+ #### If a frame-size change does occur we should probably actually
+be preempting redisplay.
+
+\1f
+File: internals.info, Node: Line Start Cache, Next: Redisplay Piece by Piece, Prev: Critical Redisplay Sections, Up: The Redisplay Mechanism
+
+Line Start Cache
+================
+
+The traditional scrolling code in Emacs breaks in a variable height
+world. It depends on the key assumption that the number of lines that
+can be displayed at any given time is fixed. This led to a complete
+separation of the scrolling code from the redisplay code. In order to
+fully support variable height lines, the scrolling code must actually be
+tightly integrated with redisplay. Only redisplay can determine how
+many lines will be displayed on a screen for any given starting point.
+
+ What is ideally wanted is a complete list of the starting buffer
+position for every possible display line of a buffer along with the
+height of that display line. Maintaining such a full list would be very
+expensive. We settle for having it include information for all areas
+which we happen to generate anyhow (i.e. the region currently being
+displayed) and for those areas we need to work with.
+
+ In order to ensure that the cache accurately represents what
+redisplay would actually show, it is necessary to invalidate it in many
+situations. If the buffer changes, the starting positions may no longer
+be correct. If a face or an extent has changed then the line heights
+may have altered. These events happen frequently enough that the cache
+can end up being constantly disabled. With this potentially constant
+invalidation, when is the cache ever useful?
+
+ Even if the cache is invalidated before every single usage, it is
+necessary. Scrolling often requires knowledge about display lines which
+are actually above or below the visible region. The cache provides a
+convenient light-weight method of storing this information for multiple
+display regions. This knowledge is necessary for the scrolling code to
+always obey the First Golden Rule of Redisplay.
+
+ If the cache already contains all of the information that the
+scrolling routines happen to need so that it doesn't have to go
+generate it, then we are able to obey the Third Golden Rule of
+Redisplay. The first thing we do to help out the cache is to always
+add the displayed region. This region had to be generated anyway, so
+the cache ends up getting the information basically for free. In those
+cases where a user is simply scrolling around viewing a buffer there is
+a high probability that this is sufficient to always provide the needed
+information. The second thing we can do is be smart about invalidating
+the cache.
+
+ TODO--Be smart about invalidating the cache. Potential places:
+
+ * Insertions at end-of-line which don't cause line-wraps do not
+ alter the starting positions of any display lines. These types of
+ buffer modifications should not invalidate the cache. This is
+ actually a large optimization for redisplay speed as well.
+
+ * Buffer modifications frequently only affect the display of lines
+ at and below where they occur. In these situations we should only
+ invalidate the part of the cache starting at where the
+ modification occurs.
+
+ In case you're wondering, the Second Golden Rule of Redisplay is not
+applicable.
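+
+ The second TODO item, invalidating only from the point of
+modification downward, might look roughly like the following. The
+`line_start' and `line_start_cache' structures and `invalidate_from'
+are hypothetical; the real cache layout may differ, and the sketch
+ignores edits that could re-wrap the preceding display line.
+
```c
#include <stddef.h>

/* One cached display line: where it starts and how tall it is. */
struct line_start { long start; int height; };

/* Entries are kept sorted by increasing START. */
struct line_start_cache { struct line_start *lines; size_t count; };

/* Drop the entry for the display line containing POS and every entry
   below it; entries above stay valid.  Returns the number kept.  */
size_t
invalidate_from (struct line_start_cache *cache, long pos)
{
  size_t keep = 0;

  while (keep < cache->count && cache->lines[keep].start <= pos)
    keep++;
  if (keep > 0)
    keep--;                     /* the line holding POS is stale too */
  cache->count = keep;
  return keep;
}
```
+
+ With cached lines starting at 1, 40, 95 and 130, a modification at
+position 100 keeps only the first two entries.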
+
+\1f
+File: internals.info, Node: Redisplay Piece by Piece, Prev: Line Start Cache, Up: The Redisplay Mechanism
+
+Redisplay Piece by Piece
+========================
+
+As you can begin to see, redisplay is complex and also not well
+documented. Chuck no longer works on XEmacs, so this section is my take
+on the workings of redisplay.
+
+ Redisplay happens in three phases:
+
+ 1. Determine desired display in area that needs redisplay.
+ Implemented by `redisplay.c'
+
+ 2. Compare desired display with current display. Implemented by
+ `redisplay-output.c'
+
+ 3. Output changes. Implemented by `redisplay-output.c',
+ `redisplay-x.c', `redisplay-msw.c' and `redisplay-tty.c'
+
+ Steps 1 and 2 are device-independent and relatively complex. Step 3
+is mostly device-dependent.
+
+ Determining the desired display
+
+ Display attributes are stored in `display_line' structures. Each
+`display_line' consists of a set of `display_block' structures, and each
+`display_block' contains a number of `rune' structures. Generally, each
+window holds dynarrs of `display_line' structures representing the
+current display and the desired display.
+
+ The `display_line' structures are tightly tied to buffers which
+presents a problem for redisplay as this connection is bogus for the
+modeline. Hence the `display_line' generation routines are duplicated
+for generating the modeline. This means that the modeline display code
+has many bugs that the standard redisplay code does not.
+
+ The guts of `display_line' generation are in `create_text_block',
+which creates a single display line for the desired locale. This
+incrementally parses the characters on the current line and generates
+redisplay structures for each.
+
+ Gutter redisplay is different. Because the data to display is stored
+in a string we cannot use `create_text_block'. Instead we use
+`create_text_string_block' which performs the same function as
+`create_text_block' but for strings. Many of the complexities of
+`create_text_block' to do with cursor handling and selective display
+have been removed.
+
+\1f
+File: internals.info, Node: Extents, Next: Faces, Prev: The Redisplay Mechanism, Up: Top
+
+Extents
+*******
+
+* Menu:
+
+* Introduction to Extents:: Extents are ranges over text, with properties.
+* Extent Ordering:: How extents are ordered internally.
+* Format of the Extent Info:: The extent information in a buffer or string.
+* Zero-Length Extents:: A weird special case.
+* Mathematics of Extent Ordering:: A rigorous foundation.
+* Extent Fragments:: Cached information useful for redisplay.
+
+\1f
+File: internals.info, Node: Introduction to Extents, Next: Extent Ordering, Up: Extents
+
+Introduction to Extents
+=======================
+
+Extents are regions over a buffer, with a start and an end position
+denoting the region of the buffer included in the extent. In addition,
+either end can be closed or open, meaning that the endpoint is or is
+not logically included in the extent. Insertion of a character at a
+closed endpoint causes the character to go inside the extent; insertion
+at an open endpoint causes the character to go outside.
-Character-Related Data Types
+ Extent endpoints are stored using memory indices (see `insdel.c'),
+to minimize the amount of adjusting that needs to be done when
+characters are inserted or deleted.
+
+ (Formerly, extent endpoints at the gap could be either before or
+after the gap, depending on the open/closedness of the endpoint. The
+intent of this was to make it so that insertions would automatically go
+inside or out of extents as necessary with no further work needing to
+be done. It didn't work out that way, however, and just ended up
+complexifying and buggifying all the rest of the code.)
+
+\1f
+File: internals.info, Node: Extent Ordering, Next: Format of the Extent Info, Prev: Introduction to Extents, Up: Extents
+
+Extent Ordering
+===============
+
+Extents are compared using memory indices. There are two orderings for
+extents and both orders are kept current at all times. The normal or
+"display" order is as follows:
+
+ Extent A is ``less than'' extent B,
+ that is, earlier in the display order,
+ if: A-start < B-start,
+ or if: A-start = B-start, and A-end > B-end
+
+ So if two extents begin at the same position, the larger of them is
+the earlier one in the display order (`EXTENT_LESS' is true).
+
+ For the e-order, the same thing holds:
+
+ Extent A is ``less than'' extent B in e-order,
+ that is, earlier in the e-order,
+ if: A-end < B-end,
+ or if: A-end = B-end, and A-start > B-start
+
+ So if two extents end at the same position, the smaller of them is
+the earlier one in the e-order (`EXTENT_E_LESS' is true).
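+
+ Written out as comparison functions, the two orderings look like
+this. The lowercase `extent_less' and `extent_e_less' are illustrative
+stand-ins for `EXTENT_LESS' and `EXTENT_E_LESS', and the two-field
+`extent' structure is an assumption.
+
```c
struct extent { long start, end; };

/* Display order: earlier start first; on equal starts, larger first. */
int
extent_less (const struct extent *a, const struct extent *b)
{
  return a->start < b->start
    || (a->start == b->start && a->end > b->end);
}

/* E-order: earlier end first; on equal ends, smaller first. */
int
extent_e_less (const struct extent *a, const struct extent *b)
{
  return a->end < b->end
    || (a->end == b->end && a->start > b->start);
}
```
+
+ For two extents [5, 20] and [5, 10], the first is less in the display
+order and the second is less in the e-order.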
+
+ The display order and the e-order are complementary orders: any
+theorem about the display order also applies to the e-order if you swap
+all occurrences of "display order" and "e-order", "less than" and
+"greater than", and "extent start" and "extent end".
+
+\1f
+File: internals.info, Node: Format of the Extent Info, Next: Zero-Length Extents, Prev: Extent Ordering, Up: Extents
+
+Format of the Extent Info
+=========================
+
+An extent-info structure consists of a list of the buffer or string's
+extents and a "stack of extents" that lists all of the extents over a
+particular position. The stack-of-extents info is used for
+optimization purposes--it basically caches some info that might be
+expensive to compute. Certain otherwise hard computations are easy
+given the stack of extents over a particular position, and if the stack
+of extents over a nearby position is known (because it was calculated
+at some prior point in time), it's easy to move the stack of extents to
+the proper position.
+
+ Given that the stack of extents is an optimization, and given that
+it requires memory, a string's stack of extents is wiped out each time
+a garbage collection occurs. Therefore, any time you retrieve the
+stack of extents, it might not be there. If you need it to be there,
+use the `_force' version.
+
+ Similarly, a string may or may not have an extent_info structure.
+(Generally it won't if there haven't been any extents added to the
+string.) So use the `_force' version if you need the extent_info
+structure to be there.
+
+ A list of extents is maintained as a double gap array: one gap array
+is ordered by start index (the "display order") and the other is
+ordered by end index (the "e-order"). Note that positions in an extent
+list should logically be conceived of as referring _to_ a particular
+extent (as is the norm in programs) rather than sitting between two
+extents. Note also that callers of these functions should not be aware
+of the fact that the extent list is implemented as an array, except for
+the fact that positions are integers (this should be generalized to
+handle integers and linked lists equally well).
+
+\1f
+File: internals.info, Node: Zero-Length Extents, Next: Mathematics of Extent Ordering, Prev: Format of the Extent Info, Up: Extents
+
+Zero-Length Extents
+===================
+
+Extents can be zero-length, and will end up that way if their endpoints
+are explicitly set that way or if their detachable property is `nil'
+and all the text in the extent is deleted. (The exception is open-open
+zero-length extents, which are barred from existing because there is no
+sensible way to define their properties. Deletion of the text in an
+open-open extent causes it to be converted into a closed-open extent.)
+Zero-length extents are primarily used to represent annotations, and
+behave as follows:
+
+ 1. Insertion at the position of a zero-length extent expands the
+ extent if both endpoints are closed; goes after the extent if it
+ is closed-open; and goes before the extent if it is open-closed.
+
+ 2. Deletion of a character on a side of a zero-length extent whose
+ corresponding endpoint is closed causes the extent to be detached
+ if it is detachable; if the extent is not detachable or the
+ corresponding endpoint is open, the extent remains in the buffer,
+ moving as necessary.
+
+ Note that closed-open, non-detachable zero-length extents behave
+exactly like markers and that open-closed, non-detachable zero-length
+extents behave like the "point-type" marker in Mule.
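+
+ Rule (1) can be stated as a short function. The `zl_extent'
+structure and `insert_at_zero_length' are hypothetical names used only
+for this sketch; positions are plain integers rather than memory
+indices.
+
```c
struct zl_extent
{
  long start, end;
  int start_open, end_open;     /* nonzero means the endpoint is open */
};

/* Adjust zero-length extent E for LEN characters inserted at POS.
   Returns 0 on success, or -1 if E is not a zero-length extent at POS
   or is open-open (which is not allowed to exist).  */
int
insert_at_zero_length (struct zl_extent *e, long pos, long len)
{
  if (e->start != pos || e->end != pos || (e->start_open && e->end_open))
    return -1;

  if (!e->start_open && !e->end_open)
    e->end += len;              /* closed-closed: the extent expands */
  else if (e->start_open)
    {
      e->start += len;          /* open-closed: text goes before, */
      e->end += len;            /* so the extent moves past it */
    }
  /* closed-open: text goes after the extent; nothing moves */
  return 0;
}
```
+
+ The closed-open case leaves the extent in place, which matches the
+observation that such extents behave like ordinary markers.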
+
+\1f
+File: internals.info, Node: Mathematics of Extent Ordering, Next: Extent Fragments, Prev: Zero-Length Extents, Up: Extents
+
+Mathematics of Extent Ordering
+==============================
+
+The extents in a buffer are ordered by "display order" because that is
+the order that the redisplay mechanism needs to process them in. The
+e-order is an auxiliary ordering used to facilitate operations over
+extents. The operations that can be performed on the ordered list of
+extents in a buffer are
+
+ 1. Locate where an extent would go if inserted into the list.
+
+ 2. Insert an extent into the list.
+
+ 3. Remove an extent from the list.
+
+ 4. Map over all the extents that overlap a range.
+
+ (4) requires being able to determine the first and last extents that
+overlap a range.
+
+ NOTE: "overlap" is used as follows:
+
+ * two ranges overlap if they have at least one point in common.
+ Whether the endpoints are open or closed makes a difference here.
+
+ * a point overlaps a range if the point is contained within the
+ range; this is equivalent to treating a point P as the range [P,
+ P].
+
+ * In the case of an _extent_ overlapping a point or range, the extent
+ is normally treated as having closed endpoints. This applies
+ consistently in the discussion of stacks of extents and such below.
+ Note that this definition of overlap is not necessarily consistent
+ with the extents that `map-extents' maps over, since `map-extents'
+ sometimes pays attention to whether the endpoints of an extent
+ are open or closed. But for our purposes, it greatly simplifies
+ things to treat all extents as having closed endpoints.
+
+ First, define >, <, <=, etc. as applied to extents to mean
+comparison according to the display order. Comparison between an
+extent E and an index I means comparison between E and the range [I, I].
+
+ Also define e>, e<, e<=, etc. to mean comparison according to the
+e-order.
+
+ For any range R, define R(0) to be the starting index of the range
+and R(1) to be the ending index of the range.
+
+ For any extent E, define E(next) to be the extent directly following
+E, and E(prev) to be the extent directly preceding E. Assume E(next)
+and E(prev) can be determined from E in constant time. (This is
+because we store the extent list as a doubly linked list.)
+
+ Similarly, define E(e-next) and E(e-prev) to be the extents directly
+following and preceding E in the e-order.
+
+ Now:
+
+ Let R be a range. Let F be the first extent overlapping R. Let L
+be the last extent overlapping R.
+
+ Theorem 1: R(1) lies between L and L(next), i.e. L <= R(1) < L(next).
+
+ This follows easily from the definition of display order. The basic
+reason that this theorem applies is that the display order sorts by
+increasing starting index.
+
+ Therefore, we can determine L just by looking at where we would
+insert R(1) into the list, and if we know F and are moving forward over
+extents, we can easily determine when we've hit L by comparing the
+extent we're at to R(1).
+
+ Theorem 2: F(e-prev) e< [1, R(0)] e<= F.
+
+ This is the analog of Theorem 1, and applies because the e-order
+sorts by increasing ending index.
+
+ Therefore, F can be found in the same amount of time as operation
+(1), i.e. the time that it takes to locate where an extent would go if
+inserted into the e-order list.
+
+ If the lists were stored as balanced binary trees, then operation (1)
+would take logarithmic time, which is usually quite fast. However,
+currently they're stored as simple doubly-linked lists, and instead we
+do some caching to try to speed things up.
+
+ Define a "stack of extents" (or "SOE") as the set of extents
+(ordered in the display order) that overlap an index I, together with
+the SOE's "previous" extent, which is an extent that precedes I in the
+e-order. (Hopefully there will not be very many extents between I and
+the previous extent.)
+
+ Now:
+
+ Let I be an index, let S be the stack of extents on I, let F be the
+first extent in S, and let P be S's previous extent.
+
+ Theorem 3: The first extent in S is the first extent that overlaps
+any range [I, J].
+
+ Proof: Any extent that overlaps [I, J] but does not include I must
+have a start index > I, and thus be greater than any extent in S.
+
+ Therefore, finding the first extent that overlaps a range R is the
+same as finding the first extent that overlaps R(0).
+
+ Theorem 4: Let I2 be an index such that I2 > I, and let F2 be the
+first extent that overlaps I2. Then, either F2 is in S or F2 is
+greater than any extent in S.
+
+ Proof: If F2 does not include I then its start index is greater than
+I and thus it is greater than any extent in S, including F. Otherwise,
+F2 includes I and thus is in S, and thus F2 >= F.
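+
+ The practical consequence of sorting by increasing start index is
+that a forward walk over the display-ordered list can stop as soon as
+an extent starts after R(1). The following sketch shows just that
+stopping rule, with all endpoints treated as closed; it uses a plain
+array and a linear scan from the beginning, not the e-order or the
+stack-of-extents cache, and `ordered_extent' and `count_overlapping'
+are hypothetical names.
+
```c
#include <stddef.h>

struct ordered_extent { long start, end; };

/* LIST holds N extents in display order.  Count those overlapping
   the closed range [R0, R1].  */
size_t
count_overlapping (const struct ordered_extent *list, size_t n,
                   long r0, long r1)
{
  size_t hits = 0;

  /* Once an extent starts after R1, so does every later one.  */
  for (size_t i = 0; i < n && list[i].start <= r1; i++)
    if (list[i].end >= r0)
      hits++;
  return hits;
}
```
+
+ Finding where to begin the walk without scanning from the start of
+the list is what the e-order and the cached stack of extents are for.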
+
+\1f
+File: internals.info, Node: Extent Fragments, Prev: Mathematics of Extent Ordering, Up: Extents
+
+Extent Fragments
+================
+
+Imagine that the buffer is divided up into contiguous, non-overlapping
+"runs" of text such that no extent starts or ends within a run (extents
+that abut the run don't count).
+
+ An extent fragment is a structure that holds data about the run that
+contains a particular buffer position (if the buffer position is at the
+junction of two runs, the run after the position is used)--the
+beginning and end of the run, a list of all of the extents in that run,
+the "merged face" that results from merging all of the faces
+corresponding to those extents, the begin and end glyphs at the
+beginning of the run, etc. This is the information that redisplay needs
+in order to display this run.
+
+ Extent fragments have to be very quick to update to a new buffer
+position when moving linearly through the buffer. They rely on the
+stack-of-extents code, which does the heavy-duty algorithmic work of
+determining which extents overlie a particular position.
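+
+ To make the notion of a run concrete, here is a deliberately naive
+way to find the run containing a position by scanning every extent.
+The real code relies on the stack of extents instead of a full scan;
+`frag_extent', `run' and `run_at' are hypothetical names, and runs are
+treated as half-open so that a position at a junction belongs to the
+run after it.
+
```c
#include <stddef.h>

struct frag_extent { long start, end; };
struct run { long begin, end; };   /* covers BEGIN up to, not including, END */

/* Find the run containing POS in a buffer spanning BUF_BEGIN..BUF_END:
   the nearest extent endpoint at or before POS starts the run, and the
   nearest endpoint after POS ends it.  */
struct run
run_at (const struct frag_extent *ex, size_t n, long pos,
        long buf_begin, long buf_end)
{
  struct run r = { buf_begin, buf_end };

  for (size_t i = 0; i < n; i++)
    {
      long ends[2] = { ex[i].start, ex[i].end };
      for (int k = 0; k < 2; k++)
        {
          if (ends[k] <= pos && ends[k] > r.begin)
            r.begin = ends[k];
          if (ends[k] > pos && ends[k] < r.end)
            r.end = ends[k];
        }
    }
  return r;
}
```
+
+ With extents [5, 15] and [10, 20], position 12 lies in the run from
+10 to 15, and position 15, being a junction, lies in the run from 15 to
+20.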
+
+\1f
+File: internals.info, Node: Faces, Next: Glyphs, Prev: Extents, Up: Top
+
+Faces
+*****
+
+Not yet documented.
+
+\1f
+File: internals.info, Node: Glyphs, Next: Specifiers, Prev: Faces, Up: Top
+
+Glyphs
+******
+
+Glyphs are graphical elements that can be displayed in XEmacs buffers or
+gutters. We use the term graphical element here in the broadest possible
+sense since glyphs can be as mundane as text or as arcane as a native
+tab widget.
+
+ In XEmacs, glyphs represent the uninstantiated state of graphical
+elements, i.e. they hold all the information necessary to produce an
+image on-screen but the image need not exist at this stage, and multiple
+screen images can be instantiated from a single glyph.
+
+ Glyphs are lazily instantiated by calling one of the glyph
+functions. This usually occurs within redisplay when `Fglyph_height' is
+called. Instantiation causes an image-instance to be created and
+cached. This cache is on a per-device basis for all glyphs except
+widget-glyphs, and on a per-window basis for widget-glyphs. The
+caching is done by `image_instantiate' and is necessary because it is
+generally possible to display an image-instance in multiple domains.
+For instance if we create a Pixmap, we can actually display this on
+multiple windows - even though we only need a single Pixmap instance to
+do this. If caching were not done, it would be necessary to create
+image-instances for every displayable occurrence of a glyph, and for
+every usage, which would be extremely memory- and CPU-intensive.
+
+ Widget-glyphs (a.k.a. native widgets) are not cached in this way.
+This is because widget-glyph image-instances on screen are toolkit
+windows, and thus cannot be reused in multiple XEmacs domains. Thus
+widget-glyphs are cached on an XEmacs window basis.
+
+ Any action on a glyph first consults the cache before actually
+instantiating a widget.
+
+Glyph Instantiation
+===================
+
+Glyph instantiation is a hairy topic and requires some explanation. The
+guts of glyph instantiation are contained within `image_instantiate'. A
+glyph contains an image which is a specifier. When a glyph function -
+for instance `Fglyph_height' - asks for a property of the glyph that
+can only be determined from its instantiated state, then the glyph
+image is instantiated and an image instance created. The instantiation
+process is governed by the specifier code and goes through a series of
+steps:
+
+ * Validation. Instantiation of image instances happens dynamically -
+ often within the guts of redisplay. Thus it is often not feasible
+ to catch instantiator errors at instantiation time. Instead the
+ instantiator is validated at the time it is added to the image
+ specifier. This check is performed by `image_validate', which at a
+ simple level validates keyword-value pairs.
+
+ * Duplication. The specifier code by default takes a copy of the
+ instantiator. This is reasonable for most specifiers but in the
+ case of widget-glyphs can be problematic, since some of the
+ properties in the instantiator - for instance callbacks - could
+ cause infinite recursion in the copying process. Thus the image
+ code defines a function - `image_copy_instantiator' - which will
+ selectively copy values. This is controlled by the way that a
+ keyword is defined either using `IIFORMAT_VALID_KEYWORD' or
+ `IIFORMAT_VALID_NONCOPY_KEYWORD'. Note that the image caching and
+ redisplay code relies on instantiator copying to ensure that
+ current and new instantiators are actually different rather than
+ referring to the same thing.
+
+ * Normalization. Once the instantiator has been copied it must be
+ converted into a form that is viable at instantiation time. This
+ can involve no changes at all, but typically involves things like
+ converting file names to the actual data. This step is performed
+ by `image_going_to_add' and `normalize_image_instantiator'.
+
+ * Instantiation. When an image instance is actually required for
+ display it is instantiated using `image_instantiate'. This
+ involves calling instantiate methods that are specific to the type
+ of image being instantiated.
+
+ The final instantiation phase also involves a number of steps. In
+order to understand these we need to describe a number of concepts.
+
+ An image is instantiated in a "domain", where a domain can be any
+one of a device, frame, window or image-instance. The domain gives the
+image-instance context and identity, and properties that affect the
+appearance of the image-instance may differ for the same glyph
+instantiated in different domains. An example is the face used to
+display the image-instance.
+
+ Although an image is instantiated in a particular domain the
+instantiation domain is not necessarily the domain in which the
+image-instance is cached. For example a pixmap can be instantiated in a
+window but actually be cached on a per-device basis. The domain in which
+the image-instance is actually cached is called the "governing-domain".
+A governing-domain is currently either a device or a window.
+Widget-glyphs and text-glyphs have a window as a governing-domain, all
+other image-instances have a device as the governing-domain. The
+governing domain for an image-instance is determined using the
+`governing_domain' image-instance method.
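+
+ The rule for choosing a governing-domain can be sketched as follows.
+This is an illustration only: the enumerations and the
+`toy_governing_domain' function are hypothetical names, not the
+image-instance method itself, and only three image types are
+modelled.
+
```c
#include <assert.h>

/* Hypothetical, reduced set of image-instance types. */
enum toy_image_type
{
  TOY_IMAGE_TEXT,
  TOY_IMAGE_PIXMAP,
  TOY_IMAGE_WIDGET
};

/* The kinds of domain in which an image can be instantiated. */
enum toy_domain_kind
{
  TOY_DOMAIN_DEVICE,
  TOY_DOMAIN_FRAME,
  TOY_DOMAIN_WINDOW,
  TOY_DOMAIN_IMAGE_INSTANCE
};

/* Return the kind of domain on which an image-instance of TYPE is
   cached.  Widget-glyphs and text-glyphs are cached per window;
   everything else is cached per device. */
static enum toy_domain_kind
toy_governing_domain (enum toy_image_type type)
{
  switch (type)
    {
    case TOY_IMAGE_WIDGET:
    case TOY_IMAGE_TEXT:
      return TOY_DOMAIN_WINDOW;
    default:
      return TOY_DOMAIN_DEVICE;
    }
}
```
+
+ Under this rule a pixmap instantiated in a window is still looked up
+in the cache of that window's device.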
+
+Widget-Glyphs
+=============
+
+Widget-Glyphs in the MS-Windows Environment
+===========================================
+
+To Do
+
+Widget-Glyphs in the X Environment
+==================================
+
+Widget-glyphs under X make heavy use of lwlib (*note Lucid Widget
+Library::) for manipulating the native toolkit objects. This is
+primarily so that different toolkits can be supported for
+widget-glyphs, just as they are supported for features such as menubars
+etc.
+
+ Lwlib is extremely poorly documented and quite hairy so here is my
+understanding of what goes on.
+
+ Lwlib maintains a set of widget_instances which mirror the
+hierarchical state of Xt widgets. I think this is so that widgets can
+be updated and manipulated generically by the lwlib library. For
+instance update_one_widget_instance can cope with multiple types of
+widget and multiple types of toolkit. Each element in the widget
+hierarchy is updated from its corresponding widget_instance by walking
+the widget_instance tree recursively.
+
+ This has desirable properties such as lw_modify_all_widgets which is
+called from `glyphs-x.c' and updates all the properties of a widget
+without having to know what the widget is or what toolkit it is from.
+Unfortunately this also has hairy properties such as making the lwlib
+code quite complex. And of course lwlib has to know at some level what
+the widget is and how to set its properties.
+
+\1f
+File: internals.info, Node: Specifiers, Next: Menus, Prev: Glyphs, Up: Top
+
+Specifiers
+**********
+
+Not yet documented.
+
+\1f
+File: internals.info, Node: Menus, Next: Subprocesses, Prev: Specifiers, Up: Top
+
+Menus
+*****
+
+A menu is set by setting the value of the variable `current-menubar'
+(which may be buffer-local) and then calling `set-menubar-dirty-flag'
+to signal a change. This will cause the menu to be redrawn at the next
+redisplay. The format of the data in `current-menubar' is described in
+`menubar.c'.
+
+ Internally the data in `current-menubar' is parsed into a tree of
+`widget_value' structures (defined in `lwlib.h'); this is done by the
+recursive function `menu_item_descriptor_to_widget_value()', called by
+`compute_menubar_data()'. Such a tree is deallocated using
+`free_widget_value()'.
+
+ `update_screen_menubars()' is one of the external entry points.
+This checks to see, for each screen, if that screen's menubar needs to
+be updated. This is the case if
+
+ 1. `set-menubar-dirty-flag' was called since the last redisplay.
+ (This function sets the C variable menubar_has_changed.)
+
+ 2. The buffer displayed in the screen has changed.
+
+ 3. The screen has no menubar currently displayed.
+
+ `set_screen_menubar()' is called for each such screen. This
+function calls `compute_menubar_data()' to create the tree of
+`widget_value' structures, then calls `lw_create_widget()',
+`lw_modify_all_widgets()', and/or `lw_destroy_all_widgets()' to create
+the X-Toolkit widget associated with the menu.
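+
+ The three update conditions listed above amount to a simple
+predicate. The sketch below is an assumption-laden illustration: the
+`toy_screen' structure and `toy_menubar_needs_update' are hypothetical
+names, and only the `menubar_has_changed' flag corresponds to a
+variable named in the text.
+
```c
#include <assert.h>

/* Hypothetical per-screen state relevant to menubar updates. */
struct toy_screen
{
  int has_menubar;     /* nonzero if a menubar is currently displayed */
  int buffer_changed;  /* nonzero if the displayed buffer has changed */
};

/* Return nonzero if the menubar of screen S must be recomputed.
   MENUBAR_HAS_CHANGED stands for the C flag that
   `set-menubar-dirty-flag' sets. */
static int
toy_menubar_needs_update (int menubar_has_changed,
                          const struct toy_screen *s)
{
  return menubar_has_changed     /* condition 1 */
    || s->buffer_changed         /* condition 2 */
    || !s->has_menubar;          /* condition 3 */
}
```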
+
+ `update_psheets()', the other external entry point, actually changes
+the menus being displayed. It uses the widgets fixed by
+`update_screen_menubars()' and calls various X functions to ensure that
+the menus are displayed properly.
+
+ The menubar widget is set up so that `pre_activate_callback()' is
+called when the menu is first selected (i.e. mouse button goes down),
+and `menubar_selection_callback()' is called when an item is selected.
+`pre_activate_callback()' calls the function in activate-menubar-hook,
+which can change the menubar (this is described in `menubar.c'). If
+the menubar is changed, `set_screen_menubars()' is called.
+`menubar_selection_callback()' enqueues a menu event, putting in it a
+function to call (either `eval' or `call-interactively') and its
+argument, which is the callback function or form given in the menu's
+description.
+
+\1f
+File: internals.info, Node: Subprocesses, Next: Interface to the X Window System, Prev: Menus, Up: Top
+
+Subprocesses
+************
+
+The fields of a process are:
+
+`name'
+ A string, the name of the process.
+
+`command'
+ A list containing the command arguments that were used to start
+ this process.
+
+`filter'
+ A function used to accept output from the process instead of a
+ buffer, or `nil'.
+
+`sentinel'
+ A function called whenever the process receives a signal, or `nil'.
+
+`buffer'
+ The associated buffer of the process.
+
+`pid'
+ An integer, the Unix process ID.
+
+`childp'
+ A flag, non-`nil' if this is really a child process. It is `nil'
+ for a network connection.
+
+`mark'
+ A marker indicating the position of the end of the last output
+ from this process inserted into the buffer. This is often but not
+ always the end of the buffer.
+
+`kill_without_query'
+ If this is non-`nil', killing XEmacs while this process is still
+ running does not ask for confirmation about killing the process.
+
+`raw_status_low'
+`raw_status_high'
+ These two fields record 16 bits each of the process status
+ returned by the `wait' system call.
+
+`status'
+ The process status, as `process-status' should return it.
+
+`tick'
+`update_tick'
+ If these two fields are not equal, a change in the status of the
+ process needs to be reported, either by running the sentinel or by
+ inserting a message in the process buffer.
+
+`pty_flag'
+ Non-`nil' if communication with the subprocess uses a PTY; `nil'
+ if it uses a pipe.
+
+`infd'
+ The file descriptor for input from the process.
+
+`outfd'
+ The file descriptor for output to the process.
+
+`subtty'
+ The file descriptor for the terminal that the subprocess is using.
+ (On some systems, there is no need to record this, so the value is
+ `-1'.)
+
+`tty_name'
+ The name of the terminal that the subprocess is using, or `nil' if
+ it is using pipes.
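+
+ As an illustration of the `raw_status_low' and `raw_status_high'
+fields, the sketch below splits a status word into two 16-bit halves
+and joins them again. The type and function names are hypothetical,
+and the sketch assumes the status fits in 32 bits; it does not
+reproduce the actual process code.
+
```c
#include <assert.h>

/* Hypothetical pair of 16-bit halves of a status word. */
struct toy_raw_status
{
  unsigned int low;   /* bits 0..15 */
  unsigned int high;  /* bits 16..31 */
};

/* Split STATUS into its low and high 16-bit halves. */
static struct toy_raw_status
toy_split_status (unsigned long status)
{
  struct toy_raw_status r;

  r.low = (unsigned int) (status & 0xFFFFUL);
  r.high = (unsigned int) ((status >> 16) & 0xFFFFUL);
  return r;
}

/* Rebuild the 32-bit status word from its two halves. */
static unsigned long
toy_join_status (struct toy_raw_status r)
{
  return ((unsigned long) r.high << 16) | (unsigned long) r.low;
}
```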
+
+\1f
+File: internals.info, Node: Interface to the X Window System, Next: Index, Prev: Subprocesses, Up: Top
+
+Interface to the X Window System
+********************************
+
+Mostly undocumented.
+
+* Menu:
+
+* Lucid Widget Library:: An interface to various widget sets.
+
+\1f
+File: internals.info, Node: Lucid Widget Library, Up: Interface to the X Window System
+
+Lucid Widget Library
+====================
+
+Lwlib is extremely poorly documented and quite hairy. The author(s)
+blame that on X, Xt, and Motif, with some justice, but also sufficient
+hypocrisy to avoid drawing the obvious conclusion about their own work.
+
+ The Lucid Widget Library is composed of two more or less independent
+pieces. The first, as the name suggests, is a set of widgets. These
+widgets are intended to resemble and improve on widgets provided in the
+Motif toolkit but not in the Athena widgets, including menubars and
+scrollbars. Recent additions by Andy Piper integrate some "modern"
+widgets by Edward Falk, including checkboxes, radio buttons, progress
+gauges, and index tab controls (aka notebooks).
+
+ The second piece of the Lucid widget library is a generic interface
+to several toolkits for X (including Xt, the Athena widget set, and
+Motif, as well as the Lucid widgets themselves) so that core XEmacs
+code need not know which widget set has been used to build the
+graphical user interface.
+
+* Menu:
+
+* Generic Widget Interface:: The lwlib generic widget interface.
+* Scrollbars::
+* Menubars::
+* Checkboxes and Radio Buttons::
+* Progress Bars::
+* Tab Controls::
+
+\1f
+File: internals.info, Node: Generic Widget Interface, Next: Scrollbars, Up: Lucid Widget Library
+
+Generic Widget Interface
+------------------------
+
+In general in any toolkit a widget may be a composite object. In Xt,
+all widgets have an X window that they manage, but typically a complex
+widget will have widget children, each of which manages a subwindow of
+the parent widget's X window. These children may themselves be
+composite widgets. Thus a widget is actually a tree or hierarchy of
+widgets.
+
+ For each toolkit widget, lwlib maintains a tree of `widget_value'
+structures which mirrors the hierarchical state of Xt widgets
+(including Motif, Athena, 3D Athena, and Falk's widget sets). Each
+`widget_value' has a `contents' member, which points to the head of a
+linked list of its
+children. The linked list of siblings is chained through the `next'
+member of `widget_value'.
+
+ +-----------+
+ | composite |
+ +-----------+
+ |
+ | contents
+ V
+ +-------+ next +-------+ next +-------+
+ | child |----->| child |----->| child |
+ +-------+ +-------+ +-------+
+ |
+ | contents
+ V
+ +-------------+ next +-------------+
+ | grand child |----->| grand child |
+ +-------------+ +-------------+
+
+ The `widget_value' hierarchy of a composite widget with two simple
+ children and one composite child.
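+
+ The `contents'/`next' layout can be walked with a short recursive
+function. The structure below is a hypothetical reduction that keeps
+only the two link members named above plus a name; the real
+`widget_value' in `lwlib.h' has more members, which are not shown
+here.
+
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction of a widget_value: a name plus the two links
   described in the text. */
struct toy_widget_value
{
  const char *name;
  struct toy_widget_value *contents;  /* head of the list of children */
  struct toy_widget_value *next;      /* next sibling */
};

/* Count WV, its following siblings, and all of their descendants.
   Siblings are visited by iteration, children by recursion. */
static size_t
toy_count_widget_values (const struct toy_widget_value *wv)
{
  size_t n = 0;

  for (; wv != NULL; wv = wv->next)
    n += 1 + toy_count_widget_values (wv->contents);
  return n;
}
```
+
+ Applied to the hierarchy in the figure, the count covers the
+composite, its three children, and the two grandchildren.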
+
+ The `widget_instance' structure maintains the inverse view of the
+tree. As for the `widget_value', siblings are chained through the
+`next' member. However, rather than naming children, the
+`widget_instance' tree links to parents.
+
+ +-----------+
+ | composite |
+ +-----------+
+ A
+ | parent
+ |
+ +-------+ next +-------+ next +-------+
+ | child |----->| child |----->| child |
+ +-------+ +-------+ +-------+
+ A
+ | parent
+ |
+ +-------------+ next +-------------+
+ | grand child |----->| grand child |
+ +-------------+ +-------------+
+
+ The `widget_instance' hierarchy of a composite widget with two simple
+ children and one composite child.
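+
+ Because each node links to its parent, the root of a tree is found
+by following `parent' pointers upward. The sketch below assumes a
+hypothetical structure holding only the `parent' and `next' links;
+the function names are illustrative and do not belong to lwlib.
+
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction of a widget_instance: only the two links
   described in the text. */
struct toy_widget_instance
{
  struct toy_widget_instance *parent;  /* NULL at the root */
  struct toy_widget_instance *next;    /* next sibling */
};

/* Follow parent links from WI up to the root of its tree. */
static const struct toy_widget_instance *
toy_instance_root (const struct toy_widget_instance *wi)
{
  assert (wi != NULL);
  while (wi->parent != NULL)
    wi = wi->parent;
  return wi;
}

/* Return the number of parent links between WI and the root. */
static size_t
toy_instance_depth (const struct toy_widget_instance *wi)
{
  size_t depth = 0;

  assert (wi != NULL);
  for (; wi->parent != NULL; wi = wi->parent)
    depth++;
  return depth;
}
```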
+
+ This permits widgets derived from different toolkits to be updated
+and manipulated generically by the lwlib library. For instance
+`update_one_widget_instance' can cope with multiple types of widget and
+multiple types of toolkit. Each element in the widget hierarchy is
+updated from its corresponding `widget_value' by walking the
+`widget_value' tree. This has desirable properties. For example,
+`lw_modify_all_widgets' is called from `glyphs-x.c' and updates all the
+properties of a widget without having to know what the widget is or
+what toolkit it is from. Unfortunately this also has its hairy
+properties; the lwlib code is quite complex. And of course lwlib has to
+know at some level what the widget is and how to set its properties.
+
+ The `widget_instance' structure also contains a pointer to the root
+of its tree. Widget instances are further confi
+
+\1f
+File: internals.info, Node: Scrollbars, Next: Menubars, Prev: Generic Widget Interface, Up: Lucid Widget Library
+
+Scrollbars
+----------
+
+\1f
+File: internals.info, Node: Menubars, Next: Checkboxes and Radio Buttons, Prev: Scrollbars, Up: Lucid Widget Library
+
+Menubars
+--------
+
+\1f
+File: internals.info, Node: Checkboxes and Radio Buttons, Next: Progress Bars, Prev: Menubars, Up: Lucid Widget Library
+
+Checkboxes and Radio Buttons
----------------------------
- First, let's review the basic character-related datatypes used by
-XEmacs. Note that the separate `typedef's are not mandatory in the
-current implementation (all of them boil down to `unsigned char' or
-`int'), but they improve clarity of code a great deal, because one
-glance at the declaration can tell the intended use of the variable.
-
-`Emchar'
- An `Emchar' holds a single Emacs character.
-
- Obviously, the equality between characters and bytes is lost in
- the Mule world. Characters can be represented by one or more
- bytes in the buffer, and `Emchar' is the C type large enough to
- hold any character.
-
- Without Mule support, an `Emchar' is equivalent to an `unsigned
- char'.
-
-`Bufbyte'
- The data representing the text in a buffer or string is logically
- a set of `Bufbyte's.
-
- XEmacs does not work with the same character formats all the time;
- when reading characters from the outside, it decodes them to an
- internal format, and likewise encodes them when writing.
- `Bufbyte' (in fact `unsigned char') is the basic unit of XEmacs
- internal buffers and strings format. A `Bufbyte *' is the type
- that points at text encoded in the variable-width internal
- encoding.
-
- One character can correspond to one or more `Bufbyte's. In the
- current Mule implementation, an ASCII character is represented by
- the same `Bufbyte', and other characters are represented by a
- sequence of two or more `Bufbyte's.
-
- Without Mule support, there are exactly 256 characters, implicitly
- Latin-1, and each character is represented using one `Bufbyte', and
- there is a one-to-one correspondence between `Bufbyte's and
- `Emchar's.
-
-`Bufpos'
-`Charcount'
- A `Bufpos' represents a character position in a buffer or string.
- A `Charcount' represents a number (count) of characters.
- Logically, subtracting two `Bufpos' values yields a `Charcount'
- value. Although all of these are `typedef'ed to `EMACS_INT', we
- use them in preference to `EMACS_INT' to make it clear what sort
- of position is being used.
-
- `Bufpos' and `Charcount' values are the only ones that are ever
- visible to Lisp.
-
-`Bytind'
-`Bytecount'
- A `Bytind' represents a byte position in a buffer or string. A
- `Bytecount' represents the distance between two positions, in
- bytes. The relationship between `Bytind' and `Bytecount' is the
- same as the relationship between `Bufpos' and `Charcount'.
-
-`Extbyte'
-`Extcount'
- When dealing with the outside world, XEmacs works with `Extbyte's,
- which are equivalent to `unsigned char'. Obviously, an `Extcount'
- is the distance between two `Extbyte's. Extbytes and Extcounts
- are not all that frequent in XEmacs code.
+\1f
+File: internals.info, Node: Progress Bars, Next: Tab Controls, Prev: Checkboxes and Radio Buttons, Up: Lucid Widget Library
+
+Progress Bars
+-------------
+
+\1f
+File: internals.info, Node: Tab Controls, Prev: Progress Bars, Up: Lucid Widget Library
+
+Tab Controls
+------------
+
+\1f
+File: internals.info, Node: Index, Prev: Interface to the X Window System, Up: Top
+
+Index
+*****
+
+* Menu:
+
+* allocation from frob blocks: Allocation from Frob Blocks.
+* allocation of objects in XEmacs Lisp: Allocation of Objects in XEmacs Lisp.
+* allocation, introduction to: Introduction to Allocation.
+* allocation, low-level: Low-level allocation.
+* Amdahl Corporation: XEmacs.
+* Andreessen, Marc: XEmacs.
+* asynchronous subprocesses: Modules for Interfacing with the Operating System.
+* bars, progress: Progress Bars.
+* Baur, Steve: XEmacs.
+* Benson, Eric: Lucid Emacs.
+* binding; the specbinding stack; unwind-protects, dynamic: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* bindings, evaluation; stack frames;: Evaluation; Stack Frames; Bindings.
+* bit vector: Bit Vector.
+* bridge, playing: XEmacs From the Outside.
+* Buchholz, Martin: XEmacs.
+* Bufbyte: Character-Related Data Types.
+* Bufbytes and Emchars: Bufbytes and Emchars.
+* buffer lists: Buffer Lists.
+* buffer object, the: The Buffer Object.
+* buffer, the text in a: The Text in a Buffer.
+* buffers and textual representation: Buffers and Textual Representation.
+* buffers, introduction to: Introduction to Buffers.
+* Bufpos: Character-Related Data Types.
+* building, XEmacs from the perspective of: XEmacs From the Perspective of Building.
+* buttons, checkboxes and radio: Checkboxes and Radio Buttons.
+* byte positions, working with character and: Working With Character and Byte Positions.
+* Bytecount: Character-Related Data Types.
+* bytecount_to_charcount: Working With Character and Byte Positions.
+* Bytind: Character-Related Data Types.
+* C code, rules when writing new: Rules When Writing New C Code.
+* C vs. Lisp: The Lisp Language.
+* callback routines, the event stream: The Event Stream Callback Routines.
+* caller-protects (GCPRO rule): Writing Lisp Primitives.
+* case table: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* catch and throw: Catch and Throw.
+* CCL: CCL.
+* character and byte positions, working with: Working With Character and Byte Positions.
+* character encoding, internal: Internal Character Encoding.
+* character sets: Character Sets.
+* character sets and encodings, Mule: MULE Character Sets and Encodings.
+* character-related data types: Character-Related Data Types.
+* characters, integers and: Integers and Characters.
+* Charcount: Character-Related Data Types.
+* charcount_to_bytecount: Working With Character and Byte Positions.
+* charptr_emchar: Working With Character and Byte Positions.
+* charptr_n_addr: Working With Character and Byte Positions.
+* checkboxes and radio buttons: Checkboxes and Radio Buttons.
+* closer: Lstream Methods.
+* closure: The XEmacs Object System (Abstractly Speaking).
+* code, an example of Mule-aware: An Example of Mule-Aware Code.
+* code, general guidelines for writing Mule-aware: General Guidelines for Writing Mule-Aware Code.
+* code, rules when writing new C: Rules When Writing New C Code.
+* coding conventions: A Reader's Guide to XEmacs Coding Conventions.
+* coding for Mule: Coding for Mule.
+* coding rules, general: General Coding Rules.
+* coding rules, naming: A Reader's Guide to XEmacs Coding Conventions.
+* command builder, dispatching events; the: Dispatching Events; The Command Builder.
+* comments, writing good: Writing Good Comments.
+* Common Lisp: The Lisp Language.
+* compact_string_chars: compact_string_chars.
+* compiled function: Compiled Function.
+* compiler, the Lisp reader and: The Lisp Reader and Compiler.
+* cons: Cons.
+* conservative garbage collection: GCPROing.
+* consoles; devices; frames; windows: Consoles; Devices; Frames; Windows.
+* consoles; devices; frames; windows, introduction to: Introduction to Consoles; Devices; Frames; Windows.
+* control flow modules, editor-level: Editor-Level Control Flow Modules.
+* conversion to and from external data: Conversion to and from External Data.
+* converting events: Converting Events.
+* copy-on-write: General Coding Rules.
+* creating Lisp object types: Techniques for XEmacs Developers.
+* critical redisplay sections: Critical Redisplay Sections.
+* data dumping: Data dumping.
+* data types, character-related: Character-Related Data Types.
+* DEC_CHARPTR: Working With Character and Byte Positions.
+* developers, techniques for XEmacs: Techniques for XEmacs Developers.
+* devices; frames; windows, consoles;: Consoles; Devices; Frames; Windows.
+* devices; frames; windows, introduction to consoles;: Introduction to Consoles; Devices; Frames; Windows.
+* Devin, Matthieu: Lucid Emacs.
+* dispatching events; the command builder: Dispatching Events; The Command Builder.
+* display order of extents: Mathematics of Extent Ordering.
+* display-related Lisp objects, modules for other: Modules for other Display-Related Lisp Objects.
+* displayable Lisp objects, modules for the basic: Modules for the Basic Displayable Lisp Objects.
+* dumping: Dumping.
+* dumping address allocation: Address allocation.
+* dumping and its justification, what is: Dumping.
+* dumping data descriptions: Data descriptions.
+* dumping object inventory: Object inventory.
+* dumping overview: Overview.
+* dumping phase: Dumping phase.
+* dumping, data: Data dumping.
+* dumping, file loading: Reloading phase.
+* dumping, object relocation: Reloading phase.
+* dumping, pointers: Pointers dumping.
+* dumping, putting back the pdump_opaques: Reloading phase.
+* dumping, putting back the pdump_root_objects and pdump_weak_object_chains: Reloading phase.
+* dumping, putting back the pdump_root_struct_ptrs: Reloading phase.
+* dumping, reloading phase: Reloading phase.
+* dumping, remaining issues: Remaining issues.
+* dumping, reorganize the hash tables: Reloading phase.
+* dumping, the header: The header.
+* dynamic array: Low-Level Modules.
+* dynamic binding; the specbinding stack; unwind-protects: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* dynamic scoping: The Lisp Language.
+* dynamic types: The Lisp Language.
+* editing operations, modules for standard: Modules for Standard Editing Operations.
+* Emacs 19, GNU: GNU Emacs 19.
+* Emacs 20, GNU: GNU Emacs 20.
+* Emacs, a history of: A History of Emacs.
+* Emchar: Character-Related Data Types.
+* Emchars, Bufbytes and: Bufbytes and Emchars.
+* encoding, internal character: Internal Character Encoding.
+* encoding, internal string: Internal String Encoding.
+* encodings, internal Mule: Internal Mule Encodings.
+* encodings, Mule: Encodings.
+* encodings, Mule character sets and: MULE Character Sets and Encodings.
+* Energize: Lucid Emacs.
+* Epoch <1>: XEmacs.
+* Epoch: Lucid Emacs.
+* error checking: Techniques for XEmacs Developers.
+* EUC (Extended Unix Code), Japanese: Japanese EUC (Extended Unix Code).
+* evaluation: Evaluation.
+* evaluation; stack frames; bindings: Evaluation; Stack Frames; Bindings.
+* event gathering mechanism, specifics of the: Specifics of the Event Gathering Mechanism.
+* event loop functions, other: Other Event Loop Functions.
+* event loop, events and the: Events and the Event Loop.
+* event stream callback routines, the: The Event Stream Callback Routines.
+* event, specifics about the Lisp object: Specifics About the Emacs Event.
+* events and the event loop: Events and the Event Loop.
+* events, converting: Converting Events.
+* events, introduction to: Introduction to Events.
+* events, main loop: Main Loop.
+* events; the command builder, dispatching: Dispatching Events; The Command Builder.
+* Extbyte: Character-Related Data Types.
+* Extcount: Character-Related Data Types.
+* Extended Unix Code, Japanese EUC: Japanese EUC (Extended Unix Code).
+* extent fragments: Extent Fragments.
+* extent info, format of the: Format of the Extent Info.
+* extent mathematics: Mathematics of Extent Ordering.
+* extent ordering <1>: Mathematics of Extent Ordering.
+* extent ordering: Extent Ordering.
+* extents: Extents.
+* extents, display order: Mathematics of Extent Ordering.
+* extents, introduction to: Introduction to Extents.
+* extents, markers and: Markers and Extents.
+* extents, zero-length: Zero-Length Extents.
+* external data, conversion to and from: Conversion to and from External Data.
+* external widget: Modules for Interfacing with X Windows.
+* faces: Faces.
+* file system, modules for interfacing with the: Modules for Interfacing with the File System.
+* flusher: Lstream Methods.
+* fragments, extent: Extent Fragments.
+* frames; windows, consoles; devices;: Consoles; Devices; Frames; Windows.
+* frames; windows, introduction to consoles; devices;: Introduction to Consoles; Devices; Frames; Windows.
+* Free Software Foundation: A History of Emacs.
+* frob blocks, allocation from: Allocation from Frob Blocks.
+* FSF: A History of Emacs.
+* FSF Emacs <1>: GNU Emacs 20.
+* FSF Emacs: GNU Emacs 19.
+* function, compiled: Compiled Function.
+* garbage collection: Garbage Collection.
+* garbage collection - step by step: Garbage Collection - Step by Step.
+* garbage collection protection <1>: GCPROing.
+* garbage collection protection: Writing Lisp Primitives.
+* garbage collection, conservative: GCPROing.
+* garbage collection, invocation: Invocation.
+* garbage_collect_1: garbage_collect_1.
+* gc_sweep: gc_sweep.
+* GCPROing: GCPROing.
+* global Lisp variables, adding: Adding Global Lisp Variables.
+* glyph instantiation: Glyphs.
+* glyphs: Glyphs.
+* GNU Emacs 19: GNU Emacs 19.
+* GNU Emacs 20: GNU Emacs 20.
+* Gosling, James <1>: The Lisp Language.
+* Gosling, James: Through Version 18.
+* Great Usenet Renaming: Through Version 18.
+* Hackers (Steven Levy): A History of Emacs.
+* header files, inline functions: Techniques for XEmacs Developers.
+* hierarchy of windows: Window Hierarchy.
+* history of Emacs, a: A History of Emacs.
+* Illinois, University of: XEmacs.
+* INC_CHARPTR: Working With Character and Byte Positions.
+* inline functions: Techniques for XEmacs Developers.
+* inline functions, headers: Techniques for XEmacs Developers.
+* inside, XEmacs from the: XEmacs From the Inside.
+* instantiation, glyph: Glyphs.
+* integers and characters: Integers and Characters.
+* interactive: Modules for Standard Editing Operations.
+* interfacing with the file system, modules for: Modules for Interfacing with the File System.
+* interfacing with the operating system, modules for: Modules for Interfacing with the Operating System.
+* interfacing with X Windows, modules for: Modules for Interfacing with X Windows.
+* internal character encoding: Internal Character Encoding.
+* internal Mule encodings: Internal Mule Encodings.
+* internal string encoding: Internal String Encoding.
+* internationalization, modules for: Modules for Internationalization.
+* interning: The XEmacs Object System (Abstractly Speaking).
+* interpreter and object system, modules for other aspects of the Lisp: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* ITS (Incompatible Timesharing System): A History of Emacs.
+* Japanese EUC (Extended Unix Code): Japanese EUC (Extended Unix Code).
+* Java: The Lisp Language.
+* Java vs. Lisp: The Lisp Language.
+* JIS7: JIS7.
+* Jones, Kyle: XEmacs.
+* Kaplan, Simon: XEmacs.
+* Levy, Steven: A History of Emacs.
+* library, Lucid Widget: Lucid Widget Library.
+* line start cache: Line Start Cache.
+* Lisp interpreter and object system, modules for other aspects of the: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* Lisp language, the: The Lisp Language.
+* Lisp modules, basic: Basic Lisp Modules.
+* Lisp object types, creating: Techniques for XEmacs Developers.
+* Lisp objects are represented in C, how: How Lisp Objects Are Represented in C.
+* Lisp objects, allocation of in XEmacs: Allocation of Objects in XEmacs Lisp.
+* Lisp objects, modules for other display-related: Modules for other Display-Related Lisp Objects.
+* Lisp objects, modules for the basic displayable: Modules for the Basic Displayable Lisp Objects.
+* Lisp primitives, writing: Writing Lisp Primitives.
+* Lisp reader and compiler, the: The Lisp Reader and Compiler.
+* Lisp vs. C: The Lisp Language.
+* Lisp vs. Java: The Lisp Language.
+* low-level allocation: Low-level allocation.
+* low-level modules: Low-Level Modules.
+* lrecords: lrecords.
+* lstream: Modules for Interfacing with the File System.
+* lstream functions: Lstream Functions.
+* lstream methods: Lstream Methods.
+* lstream types: Lstream Types.
+* lstream, creating an: Creating an Lstream.
+* Lstream_close: Lstream Functions.
+* Lstream_fgetc: Lstream Functions.
+* Lstream_flush: Lstream Functions.
+* Lstream_fputc: Lstream Functions.
+* Lstream_fungetc: Lstream Functions.
+* Lstream_getc: Lstream Functions.
+* Lstream_new: Lstream Functions.
+* Lstream_putc: Lstream Functions.
+* Lstream_read: Lstream Functions.
+* Lstream_reopen: Lstream Functions.
+* Lstream_rewind: Lstream Functions.
+* Lstream_set_buffering: Lstream Functions.
+* Lstream_ungetc: Lstream Functions.
+* Lstream_unread: Lstream Functions.
+* Lstream_write: Lstream Functions.
+* lstreams: Lstreams.
+* Lucid Emacs: Lucid Emacs.
+* Lucid Inc.: Lucid Emacs.
+* Lucid Widget Library: Lucid Widget Library.
+* macro hygiene: Techniques for XEmacs Developers.
+* main loop: Main Loop.
+* mark and sweep: Garbage Collection.
+* mark method <1>: lrecords.
+* mark method: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* mark_object: mark_object.
+* marker <1>: Lstream Methods.
+* marker: Marker.
+* markers and extents: Markers and Extents.
+* mathematics of extent ordering: Mathematics of Extent Ordering.
+* MAX_EMCHAR_LEN: Working With Character and Byte Positions.
+* menubars: Menubars.
+* menus: Menus.
+* merging attempts: XEmacs.
+* MIT: A History of Emacs.
+* Mlynarik, Richard: GNU Emacs 19.
+* modules for interfacing with the file system: Modules for Interfacing with the File System.
+* modules for interfacing with the operating system: Modules for Interfacing with the Operating System.
+* modules for interfacing with X Windows: Modules for Interfacing with X Windows.
+* modules for internationalization: Modules for Internationalization.
+* modules for other aspects of the Lisp interpreter and object system: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* modules for other display-related Lisp objects: Modules for other Display-Related Lisp Objects.
+* modules for regression testing: Modules for Regression Testing.
+* modules for standard editing operations: Modules for Standard Editing Operations.
+* modules for the basic displayable Lisp objects: Modules for the Basic Displayable Lisp Objects.
+* modules for the redisplay mechanism: Modules for the Redisplay Mechanism.
+* modules, a summary of the various XEmacs: A Summary of the Various XEmacs Modules.
+* modules, basic Lisp: Basic Lisp Modules.
+* modules, editor-level control flow: Editor-Level Control Flow Modules.
+* modules, low-level: Low-Level Modules.
+* MS-Windows environment, widget-glyphs in the: Glyphs.
+* Mule character sets and encodings: MULE Character Sets and Encodings.
+* Mule encodings: Encodings.
+* Mule encodings, internal: Internal Mule Encodings.
+* MULE merged XEmacs appears: XEmacs.
+* Mule, coding for: Coding for Mule.
+* Mule-aware code, an example of: An Example of Mule-Aware Code.
+* Mule-aware code, general guidelines for writing: General Guidelines for Writing Mule-Aware Code.
+* NAS: Modules for Interfacing with the Operating System.
+* native sound: Modules for Interfacing with the Operating System.
+* network connections: Modules for Interfacing with the Operating System.
+* network sound: Modules for Interfacing with the Operating System.
+* Niksic, Hrvoje: XEmacs.
+* obarrays: Obarrays.
+* object system (abstractly speaking), the XEmacs: The XEmacs Object System (Abstractly Speaking).
+* object system, modules for other aspects of the Lisp interpreter and: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* object types, creating Lisp: Techniques for XEmacs Developers.
+* object, the buffer: The Buffer Object.
+* object, the window: The Window Object.
+* objects are represented in C, how Lisp: How Lisp Objects Are Represented in C.
+* objects in XEmacs Lisp, allocation of: Allocation of Objects in XEmacs Lisp.
+* objects, modules for the basic displayable Lisp: Modules for the Basic Displayable Lisp Objects.
+* operating system, modules for interfacing with the: Modules for Interfacing with the Operating System.
+* outside, XEmacs from the: XEmacs From the Outside.
+* pane: Modules for the Basic Displayable Lisp Objects.
+* permanent objects: The XEmacs Object System (Abstractly Speaking).
+* pi, calculating: XEmacs From the Outside.
+* point: Point.
+* pointers dumping: Pointers dumping.
+* positions, working with character and byte: Working With Character and Byte Positions.
+* primitives, writing Lisp: Writing Lisp Primitives.
+* progress bars: Progress Bars.
+* protection, garbage collection: GCPROing.
+* pseudo_closer: Lstream Methods.
+* Purify: Techniques for XEmacs Developers.
+* Quantify: Techniques for XEmacs Developers.
+* radio buttons, checkboxes and: Checkboxes and Radio Buttons.
+* read syntax: The XEmacs Object System (Abstractly Speaking).
+* read-eval-print: XEmacs From the Outside.
+* reader: Lstream Methods.
+* reader and compiler, the Lisp: The Lisp Reader and Compiler.
+* reader's guide: A Reader's Guide to XEmacs Coding Conventions.
+* redisplay mechanism, modules for the: Modules for the Redisplay Mechanism.
+* redisplay mechanism, the: The Redisplay Mechanism.
+* redisplay piece by piece: Redisplay Piece by Piece.
+* redisplay sections, critical: Critical Redisplay Sections.
+* regression testing, modules for: Modules for Regression Testing.
+* reloading phase: Reloading phase.
+* relocating allocator: Low-Level Modules.
+* rename to XEmacs: XEmacs.
+* represented in C, how Lisp objects are: How Lisp Objects Are Represented in C.
+* rewinder: Lstream Methods.
+* RMS: A History of Emacs.
+* scanner: Modules for Other Aspects of the Lisp Interpreter and Object System.
+* scoping, dynamic: The Lisp Language.
+* scrollbars: Scrollbars.
+* seekable_p: Lstream Methods.
+* selections: Modules for Interfacing with X Windows.
+* set_charptr_emchar: Working With Character and Byte Positions.
+* Sexton, Harlan: Lucid Emacs.
+* sound, native: Modules for Interfacing with the Operating System.
+* sound, network: Modules for Interfacing with the Operating System.
+* SPARCWorks: XEmacs.
+* specbinding stack; unwind-protects, dynamic binding; the: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* special forms, simple: Simple Special Forms.
+* specifiers: Specifiers.
+* stack frames; bindings, evaluation;: Evaluation; Stack Frames; Bindings.
+* Stallman, Richard: A History of Emacs.
+* string: String.
+* string encoding, internal: Internal String Encoding.
+* subprocesses: Subprocesses.
+* subprocesses, asynchronous: Modules for Interfacing with the Operating System.
+* subprocesses, synchronous: Modules for Interfacing with the Operating System.
+* Sun Microsystems: XEmacs.
+* sweep_bit_vectors_1: sweep_bit_vectors_1.
+* sweep_lcrecords_1: sweep_lcrecords_1.
+* sweep_strings: sweep_strings.
+* symbol: Symbol.
+* symbol values: Symbol Values.
+* symbols and variables: Symbols and Variables.
+* symbols, introduction to: Introduction to Symbols.
+* synchronous subprocesses: Modules for Interfacing with the Operating System.
+* tab controls: Tab Controls.
+* taxes, doing: XEmacs From the Outside.
+* techniques for XEmacs developers: Techniques for XEmacs Developers.
+* TECO: A History of Emacs.
+* temporary objects: The XEmacs Object System (Abstractly Speaking).
+* testing, regression: Regression Testing XEmacs.
+* text in a buffer, the: The Text in a Buffer.
+* textual representation, buffers and: Buffers and Textual Representation.
+* Thompson, Chuck: XEmacs.
+* throw, catch and: Catch and Throw.
+* types, dynamic: The Lisp Language.
+* types, lstream: Lstream Types.
+* types, proper use of unsigned: Proper Use of Unsigned Types.
+* University of Illinois: XEmacs.
+* unsigned types, proper use of: Proper Use of Unsigned Types.
+* unwind-protects, dynamic binding; the specbinding stack;: Dynamic Binding; The specbinding Stack; Unwind-Protects.
+* values, symbol: Symbol Values.
+* variables, adding global Lisp: Adding Global Lisp Variables.
+* variables, symbols and: Symbols and Variables.
+* vector: Vector.
+* vector, bit: Bit Vector.
+* version 18, through: Through Version 18.
+* version 19, GNU Emacs: GNU Emacs 19.
+* version 20, GNU Emacs: GNU Emacs 20.
+* widget interface, generic: Generic Widget Interface.
+* widget library, Lucid: Lucid Widget Library.
+* widget-glyphs: Glyphs.
+* widget-glyphs in the MS-Windows environment: Glyphs.
+* widget-glyphs in the X environment: Glyphs.
+* Win-Emacs: XEmacs.
+* window (in Emacs): Modules for the Basic Displayable Lisp Objects.
+* window hierarchy: Window Hierarchy.
+* window object, the: The Window Object.
+* window point internals: The Window Object.
+* windows, consoles; devices; frames;: Consoles; Devices; Frames; Windows.
+* windows, introduction to consoles; devices; frames;: Introduction to Consoles; Devices; Frames; Windows.
+* Wing, Ben: XEmacs.
+* writer: Lstream Methods.
+* writing good comments: Writing Good Comments.
+* writing Lisp primitives: Writing Lisp Primitives.
+* writing Mule-aware code, general guidelines for: General Guidelines for Writing Mule-Aware Code.
+* writing new C code, rules when: Rules When Writing New C Code.
+* X environment, widget-glyphs in the: Glyphs.
+* X Window System, interface to the: Interface to the X Window System.
+* X Windows, modules for interfacing with: Modules for Interfacing with X Windows.
+* XEmacs: XEmacs.
+* XEmacs from the inside: XEmacs From the Inside.
+* XEmacs from the outside: XEmacs From the Outside.
+* XEmacs from the perspective of building: XEmacs From the Perspective of Building.
+* XEmacs goes it alone: XEmacs.
+* XEmacs object system (abstractly speaking), the: The XEmacs Object System (Abstractly Speaking).
+* Zawinski, Jamie: Lucid Emacs.
+* zero-length extents: Zero-Length Extents.
+