This is ../info/internals.info, produced by makeinfo version 4.0 from
internals/internals.texi.

INFO-DIR-SECTION XEmacs Editor
START-INFO-DIR-ENTRY
* Internals: (internals).       XEmacs Internals Manual.
END-INFO-DIR-ENTRY

   Copyright (C) 1992 - 1996 Ben Wing.  Copyright (C) 1996, 1997 Sun
Microsystems.  Copyright (C) 1994 - 1998 Free Software Foundation.
Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the section entitled "GNU General Public License" is included
exactly as in the original, and provided that the entire resulting
derived work is distributed under the terms of a permission notice
identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the section entitled "GNU General Public License"
may be included in a translation approved by the Free Software
Foundation instead of in the original English.


File: internals.info,  Node: Top,  Next: A History of Emacs,  Prev: (dir),  Up: (dir)

   This Info file contains v1.0 of the XEmacs Internals Manual.

* Menu:

* A History of Emacs::          Times, dates, important events.
* XEmacs From the Outside::     A broad conceptual overview.
* The Lisp Language::           An overview.
* XEmacs From the Perspective of Building::
* XEmacs From the Inside::
* The XEmacs Object System (Abstractly Speaking)::
* How Lisp Objects Are Represented in C::
* Rules When Writing New C Code::
* A Summary of the Various XEmacs Modules::
* Allocation of Objects in XEmacs Lisp::
* Dumping::
* Events and the Event Loop::
* Evaluation; Stack Frames; Bindings::
* Symbols and Variables::
* Buffers and Textual Representation::
* MULE Character Sets and Encodings::
* The Lisp Reader and Compiler::
* Lstreams::
* Consoles; Devices; Frames; Windows::
* The Redisplay Mechanism::
* Extents::
* Faces::
* Glyphs::
* Specifiers::
* Menus::
* Subprocesses::
* Interface to the X Window System::
* Index::


--- The Detailed Node Listing ---

A History of Emacs

* Through Version 18::          Unification prevails.
* Lucid Emacs::                 One version 19 Emacs.
* GNU Emacs 19::                The other version 19 Emacs.
* GNU Emacs 20::                The other version 20 Emacs.
* XEmacs::                      The continuation of Lucid Emacs.

Rules When Writing New C Code

* General Coding Rules::
* Writing Lisp Primitives::
* Adding Global Lisp Variables::
* Coding for Mule::
* Techniques for XEmacs Developers::

Coding for Mule

* Character-Related Data Types::
* Working With Character and Byte Positions::
* Conversion to and from External Data::
* General Guidelines for Writing Mule-Aware Code::
* An Example of Mule-Aware Code::

A Summary of the Various XEmacs Modules

* Low-Level Modules::
* Basic Lisp Modules::
* Modules for Standard Editing Operations::
* Editor-Level Control Flow Modules::
* Modules for the Basic Displayable Lisp Objects::
* Modules for other Display-Related Lisp Objects::
* Modules for the Redisplay Mechanism::
* Modules for Interfacing with the File System::
* Modules for Other Aspects of the Lisp Interpreter and Object System::
* Modules for Interfacing with the Operating System::
* Modules for Interfacing with X Windows::
* Modules for Internationalization::

Allocation of Objects in XEmacs Lisp

* Introduction to Allocation::
* Garbage Collection::
* GCPROing::
* Garbage Collection - Step by Step::
* Integers and Characters::
* Allocation from Frob Blocks::
* lrecords::
* Low-level allocation::
* Cons::
* Vector::
* Bit Vector::
* Symbol::
* Marker::
* String::
* Compiled Function::

Garbage Collection - Step by Step

* Invocation::
* garbage_collect_1::
* mark_object::
* gc_sweep::
* sweep_lcrecords_1::
* compact_string_chars::
* sweep_strings::
* sweep_bit_vectors_1::

Dumping

* Overview::
* Data descriptions::
* Dumping phase::
* Reloading phase::

Dumping phase

* Object inventory::
* Address allocation::
* The header::
* Data dumping::
* Pointers dumping::

Events and the Event Loop

* Introduction to Events::
* Main Loop::
* Specifics of the Event Gathering Mechanism::
* Specifics About the Emacs Event::
* The Event Stream Callback Routines::
* Other Event Loop Functions::
* Converting Events::
* Dispatching Events; The Command Builder::

Evaluation; Stack Frames; Bindings

* Evaluation::
* Dynamic Binding; The specbinding Stack; Unwind-Protects::
* Simple Special Forms::
* Catch and Throw::

Symbols and Variables

* Introduction to Symbols::
* Obarrays::
* Symbol Values::

Buffers and Textual Representation

* Introduction to Buffers::     A buffer holds a block of text such as a file.
* The Text in a Buffer::        Representation of the text in a buffer.
* Buffer Lists::                Keeping track of all buffers.
* Markers and Extents::         Tagging locations within a buffer.
* Bufbytes and Emchars::        Representation of individual characters.
* The Buffer Object::           The Lisp object corresponding to a buffer.

MULE Character Sets and Encodings

* Character Sets::
* Encodings::
* Internal Mule Encodings::
* CCL::

Encodings

* Japanese EUC (Extended Unix Code)::
* JIS7::

Internal Mule Encodings

* Internal String Encoding::
* Internal Character Encoding::

Lstreams

* Creating an Lstream::         Creating an lstream object.
* Lstream Types::               Different sorts of things that are streamed.
* Lstream Functions::           Functions for working with lstreams.
* Lstream Methods::             Creating new lstream types.

Consoles; Devices; Frames; Windows

* Introduction to Consoles; Devices; Frames; Windows::
* Point::
* Window Hierarchy::
* The Window Object::

The Redisplay Mechanism

* Critical Redisplay Sections::
* Line Start Cache::
* Redisplay Piece by Piece::

Extents

* Introduction to Extents::     Extents are ranges over text, with properties.
* Extent Ordering::             How extents are ordered internally.
* Format of the Extent Info::   The extent information in a buffer or string.
* Zero-Length Extents::         A weird special case.
* Mathematics of Extent Ordering::  A rigorous foundation.
* Extent Fragments::            Cached information useful for redisplay.


File: internals.info,  Node: A History of Emacs,  Next: XEmacs From the Outside,  Prev: Top,  Up: Top

A History of Emacs
******************

   XEmacs is a powerful, customizable text editor and development
environment.  It began as Lucid Emacs, which was in turn derived from
GNU Emacs, a program written by Richard Stallman of the Free Software
Foundation.  GNU Emacs dates back to the 1970's, and was modelled after
a package called "Emacs", written in 1976, that was a set of macros on
top of TECO, an old, old text editor written at MIT on the DEC PDP-10
under one of the earliest time-sharing operating systems, ITS
(Incompatible Timesharing System). (ITS dates back well before Unix.)
ITS, TECO, and Emacs were products of a group of people at MIT who
called themselves "hackers", who shared an idealistic belief system
about the free exchange of information and were fanatical in their
devotion to and time spent with computers. (The hacker subculture dates
back to the late 1950's at MIT and is described in detail in Steven
Levy's book `Hackers'.  This book also includes a lot of information
about Stallman himself and the development of Lisp, a programming
language developed at MIT that underlies Emacs.)

* Menu:

* Through Version 18::          Unification prevails.
* Lucid Emacs::                 One version 19 Emacs.
* GNU Emacs 19::                The other version 19 Emacs.
* GNU Emacs 20::                The other version 20 Emacs.
* XEmacs::                      The continuation of Lucid Emacs.


File: internals.info,  Node: Through Version 18,  Next: Lucid Emacs,  Prev: A History of Emacs,  Up: A History of Emacs

Through Version 18
==================

   Although the history of the early versions of GNU Emacs is unclear,
it is well known from the middle of 1985 onward.  A time line follows:

   * GNU Emacs version 15 (15.34) was released sometime in 1984 or 1985
     and shared some code with a version of Emacs written by James
     Gosling (the same James Gosling who later created the Java
     language).

   * GNU Emacs version 16 (first released version was 16.56) was
     released on July 15, 1985.  All Gosling code was removed due to
     potential copyright problems with the code.

   * version 16.57: released on September 16, 1985.

   * versions 16.58, 16.59: released on September 17, 1985.

   * version 16.60: released on September 19, 1985.  These later
     version 16's incorporated patches from the net, especially for
     getting Emacs to work under System V.

   * version 17.36 (first official v17 release) released on December 20,
     1985.  Included a TeX-able user manual.  First official unpatched
     version that worked on vanilla System V machines.

   * version 17.43 (second official v17 release) released on January 25,
     1986.

   * version 17.45 released on January 30, 1986.

   * version 17.46 released on February 4, 1986.

   * version 17.48 released on February 10, 1986.

   * version 17.49 released on February 12, 1986.

   * version 17.55 released on March 18, 1986.

   * version 17.57 released on March 27, 1986.

   * version 17.58 released on April 4, 1986.

   * version 17.61 released on April 12, 1986.

   * version 17.63 released on May 7, 1986.

   * version 17.64 released on May 12, 1986.

   * version 18.24 (a beta version) released on October 2, 1986.

   * version 18.30 (a beta version) released on November 15, 1986.

   * version 18.31 (a beta version) released on November 23, 1986.

   * version 18.32 (a beta version) released on December 7, 1986.

   * version 18.33 (a beta version) released on December 12, 1986.

   * version 18.35 (a beta version) released on January 5, 1987.

   * version 18.36 (a beta version) released on January 21, 1987.

   * January 27, 1987: The Great Usenet Renaming.  net.emacs is now
     comp.emacs.

   * version 18.37 (a beta version) released on February 12, 1987.

   * version 18.38 (a beta version) released on March 3, 1987.

   * version 18.39 (a beta version) released on March 14, 1987.

   * version 18.40 (a beta version) released on March 18, 1987.

   * version 18.41 (the first "official" release) released on March 22,
     1987.

   * version 18.45 released on June 2, 1987.

   * version 18.46 released on June 9, 1987.

   * version 18.47 released on June 18, 1987.

   * version 18.48 released on September 3, 1987.

   * version 18.49 released on September 18, 1987.

   * version 18.50 released on February 13, 1988.

   * version 18.51 released on May 7, 1988.

   * version 18.52 released on September 1, 1988.

   * version 18.53 released on February 24, 1989.

   * version 18.54 released on April 26, 1989.

   * version 18.55 released on August 23, 1989.  This is the earliest
     version that is still available by FTP.

   * version 18.56 released on January 17, 1991.

   * version 18.57 released late January, 1991.

   * version 18.58 released ?????.

   * version 18.59 released October 31, 1992.


File: internals.info,  Node: Lucid Emacs,  Next: GNU Emacs 19,  Prev: Through Version 18,  Up: A History of Emacs

Lucid Emacs
===========

   Lucid Emacs was developed by the (now-defunct) Lucid Inc., a maker of
C++ and Lisp development environments.  It began when Lucid decided they
wanted to use Emacs as the editor and cornerstone of their C++
development environment (called "Energize").  They needed many features
that were not available in the existing version of GNU Emacs (version
18.5something), in particular good and integrated support for GUI
elements such as mouse support, multiple fonts, multiple window-system
windows, etc.  A branch of GNU Emacs called Epoch, written at the
University of Illinois, existed that supplied many of these features;
however, Lucid needed more than what existed in Epoch.  At the time, the
Free Software Foundation was working on version 19 of Emacs (this was
sometime around 1991), which was planned to have similar features, and
so Lucid decided to work with the Free Software Foundation.  Their plan
was to add features that they needed, and coordinate with the FSF so
that the features would get included back into Emacs version 19.

   The release of version 19 was delayed, however (it finally appeared
more than a year later than initially planned), and Lucid encountered
unexpected technical resistance in getting their changes merged back
into version 19, so they decided to release their own version of Emacs,
which became Lucid Emacs 19.0.

   The initial authors of Lucid Emacs were Matthieu Devin, Harlan
Sexton, and Eric Benson, and the work was later taken over by Jamie
Zawinski, who became "Mr. Lucid Emacs" for many releases.

   A time line for Lucid Emacs/XEmacs is:

   * version 19.0 shipped with Energize 1.0, April 1992.

   * version 19.1 released June 4, 1992.

   * version 19.2 released June 19, 1992.

   * version 19.3 released September 9, 1992.

   * version 19.4 released January 21, 1993.

   * version 19.5 was a repackaging of 19.4 with a few bug fixes and
     shipped with Energize 2.0.  Never released to the net.

   * version 19.6 released April 9, 1993.

   * version 19.7 was a repackaging of 19.6 with a few bug fixes and
     shipped with Energize 2.1.  Never released to the net.

   * version 19.8 released September 6, 1993.

   * version 19.9 released January 12, 1994.

   * version 19.10 released May 27, 1994.

   * version 19.11 (first XEmacs) released September 13, 1994.

   * version 19.12 released June 23, 1995.

   * version 19.13 released September 1, 1995.

   * version 19.14 released June 23, 1996.

   * version 20.0 released February 9, 1997.

   * version 19.15 released March 28, 1997.

   * version 20.1 (not released to the net) April 15, 1997.

   * version 20.2 released May 16, 1997.

   * version 19.16 released October 31, 1997.

   * version 20.3 (the first stable version of XEmacs 20.x) released
     November 30, 1997.

   * version 20.4 released February 28, 1998.


File: internals.info,  Node: GNU Emacs 19,  Next: GNU Emacs 20,  Prev: Lucid Emacs,  Up: A History of Emacs

GNU Emacs 19
============

   About a year after the initial release of Lucid Emacs, the FSF
released a beta of their version of Emacs 19 (referred to here as "GNU
Emacs").  By this time, the current version of Lucid Emacs was 19.6.
(Strangely, the first released beta from the FSF was GNU Emacs 19.7.) A
time line for GNU Emacs version 19 is:

   * version 19.8 (beta) released May 27, 1993.

   * version 19.9 (beta) released May 27, 1993.

   * version 19.10 (beta) released May 30, 1993.

   * version 19.11 (beta) released June 1, 1993.

   * version 19.12 (beta) released June 2, 1993.

   * version 19.13 (beta) released June 8, 1993.

   * version 19.14 (beta) released June 17, 1993.

   * version 19.15 (beta) released June 19, 1993.

   * version 19.16 (beta) released July 6, 1993.

   * version 19.17 (beta) released late July, 1993.

   * version 19.18 (beta) released August 9, 1993.

   * version 19.19 (beta) released August 15, 1993.

   * version 19.20 (beta) released November 17, 1993.

   * version 19.21 (beta) released November 17, 1993.

   * version 19.22 (beta) released November 28, 1993.

   * version 19.23 (beta) released May 17, 1994.

   * version 19.24 (beta) released May 16, 1994.

   * version 19.25 (beta) released June 3, 1994.

   * version 19.26 (beta) released September 11, 1994.

   * version 19.27 (beta) released September 14, 1994.

   * version 19.28 (first "official" release) released November 1, 1994.

   * version 19.29 released June 21, 1995.

   * version 19.30 released November 24, 1995.

   * version 19.31 released May 25, 1996.

   * version 19.32 released July 31, 1996.

   * version 19.33 released August 11, 1996.

   * version 19.34 released August 21, 1996.

   * version 19.34b released September 6, 1996.

   In some ways, GNU Emacs 19 was better than Lucid Emacs; in some ways,
worse.  Lucid soon began incorporating features from GNU Emacs 19 into
Lucid Emacs; the work was mostly done by Richard Mlynarik, who had been
working on and using GNU Emacs for a long time (back as far as version
16 or 17).


File: internals.info,  Node: GNU Emacs 20,  Next: XEmacs,  Prev: GNU Emacs 19,  Up: A History of Emacs

GNU Emacs 20
============

   On February 2, 1997 work began on GNU Emacs to integrate Mule.  The
first release was made in September of that year.

   A time line for Emacs 20 is:

   * version 20.1 released September 17, 1997.

   * version 20.2 released September 20, 1997.

   * version 20.3 released August 19, 1998.


File: internals.info,  Node: XEmacs,  Prev: GNU Emacs 20,  Up: A History of Emacs

XEmacs
======

   Around the time that Lucid was developing Energize, Sun Microsystems
was developing their own development environment (called "SPARCWorks")
and also decided to use Emacs.  They joined forces with the Epoch team
at the University of Illinois and later with Lucid.  The maintainer of
the last-released version of Epoch was Marc Andreessen, but he dropped
out and the Epoch project, headed by Simon Kaplan, lured Chuck Thompson
away from a system administration job to become the primary Lucid Emacs
author for Epoch and Sun.  Chuck's area of specialty became the
redisplay engine (he replaced the old Lucid Emacs redisplay engine with
a ported version from Epoch and then later rewrote it from scratch).
Sun also hired Ben Wing (the author of Win-Emacs, a port of Lucid Emacs
to Microsoft Windows 3.1) in 1993, for what was initially a one-month
contract to fix some event problems but later became a many-year
involvement, punctuated by a six-month contract with Amdahl Corporation.

   In 1994, Sun and Lucid agreed to rename Lucid Emacs to XEmacs (a name
not favorable to either company); the first release called XEmacs was
version 19.11.  In June 1994, Lucid folded and Jamie quit to work for
the newly formed Mosaic Communications Corp., later Netscape
Communications Corp. (co-founded by the same Marc Andreessen, who had
quit his Epoch job to work on a graphical browser for the World Wide
Web).  Chuck then became the primary maintainer of XEmacs, and put out
versions 19.11 through 19.14 in conjunction with Ben.  For 19.12 and
19.13, Chuck added the new redisplay and many other display improvements
and Ben added MULE support (support for Asian and other languages) and
redesigned most of the internal Lisp subsystems to better support the
MULE work and the various other features being added to XEmacs.  After
19.14 Chuck retired as primary maintainer and Steve Baur stepped in.

   Soon after 19.13 was released, work began in earnest on the MULE
internationalization code and the source tree was divided into two
development paths.  The MULE version was initially called 19.20, but was
soon renamed to 20.0.  In 1996 Martin Buchholz of Sun Microsystems took
over the care and feeding of it and worked on it in parallel with the
19.14 development that was occurring at the same time.  After much work
by Martin, it was decided to release 20.0 ahead of 19.15 in February
1997.  The source tree remained divided until 20.2 when the version 19
source was finally retired at version 19.16.

   In 1997, Sun finally dropped all pretense of support for XEmacs and
Martin Buchholz left the company in November.  Since then (and for most
of the previous year as well, because Steve Baur was never paid to work
on XEmacs), XEmacs has existed solely on the contributions of volunteers
from the free software community.  Since 1997, Hrvoje Niksic and
Kyle Jones have figured prominently in XEmacs development.

   Many attempts have been made to merge XEmacs and GNU Emacs, but they
have consistently failed.

   A more detailed history is contained in the XEmacs About page.


File: internals.info,  Node: XEmacs From the Outside,  Next: The Lisp Language,  Prev: A History of Emacs,  Up: Top

XEmacs From the Outside
***********************

   XEmacs appears to the outside world as an editor, but it is really a
Lisp environment.  At its heart is a Lisp interpreter; it also
"happens" to contain many specialized object types (e.g. buffers,
windows, frames, events) that are useful for implementing an editor.
Some of these objects (in particular windows and frames) have
displayable representations, and XEmacs provides a function
`redisplay()' that ensures that the display of all such objects matches
their internal state.  Most of the time, a standard Lisp environment is
in a "read-eval-print" loop--i.e. "read some Lisp code, execute it, and
print the results".  XEmacs has a similar loop:

   * read an event

   * dispatch the event (i.e. "do it")

   * redisplay

   Reading an event is done using the Lisp function `next-event', which
waits for something to happen (typically, the user presses a key or
moves the mouse) and returns an event object describing this.
Dispatching an event is done using the Lisp function `dispatch-event',
which looks up the event in a keymap object (a particular kind of
object that associates an event with a Lisp function) and calls that
function.  The function "does" what the user has requested by changing
the state of particular frame objects, buffer objects, etc.  Finally,
`redisplay()' is called, which updates the display to reflect those
changes just made.  Thus is an "editor" born.
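
   The three-step loop described above can be sketched in C.  This is
only an illustration under stated assumptions: the `struct editor', the
one-character "events", the two commands, and `run_editor' are all
hypothetical, and the C functions merely borrow the names `next_event',
`dispatch_event', and `redisplay' from the text.  None of this is the
actual XEmacs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real XEmacs structures. */
enum { TEXT_MAX = 64 };

struct editor {
  char buffer[TEXT_MAX];   /* internal state: the "buffer" text */
  char screen[TEXT_MAX];   /* what the user currently sees */
  const char *pending;     /* simulated input queue of key presses */
};

typedef void (*command_fn)(struct editor *ed, char key);

/* Command: append the typed key to the buffer (if there is room). */
static void self_insert(struct editor *ed, char key)
{
  size_t len = strlen(ed->buffer);
  if (len + 1 < TEXT_MAX) {
    ed->buffer[len] = key;
    ed->buffer[len + 1] = '\0';
  }
}

/* Command: delete the last character, if any. */
static void delete_backward(struct editor *ed, char key)
{
  size_t len = strlen(ed->buffer);
  (void) key;
  if (len > 0)
    ed->buffer[len - 1] = '\0';
}

/* The "keymap": associates an event (here, one key) with a command.
   Assumption: '<' stands in for a delete key. */
static command_fn keymap_lookup(char key)
{
  return key == '<' ? delete_backward : self_insert;
}

/* Step 1: read an event; returns 0 when the queue is empty. */
static int next_event(struct editor *ed, char *key)
{
  if (*ed->pending == '\0')
    return 0;
  *key = *ed->pending++;
  return 1;
}

/* Step 2: dispatch the event: look it up and call the command. */
static void dispatch_event(struct editor *ed, char key)
{
  keymap_lookup(key)(ed, key);
}

/* Step 3: make the display match the internal state. */
static void redisplay(struct editor *ed)
{
  strcpy(ed->screen, ed->buffer);
}

/* Run the loop over a string of simulated key presses and copy the
   final screen contents into OUT (at least TEXT_MAX bytes). */
void run_editor(const char *keys, char *out)
{
  struct editor ed = { "", "", keys };
  char key;
  assert(keys != NULL && out != NULL);
  while (next_event(&ed, &key)) {
    dispatch_event(&ed, key);
    redisplay(&ed);
  }
  strcpy(out, ed.screen);
}
```

   For example, calling `run_editor("helo<lo", out)' leaves the string
`hello' in `out': the `<' key deletes the stray `o' before the last two
keys are inserted.  The commands only change the buffer; the screen
changes only in the redisplay step.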

   Note that you do not have to use XEmacs as an editor; you could just
as well make it do your taxes, compute pi, play bridge, etc.  You'd just
have to write functions to do those operations in Lisp.


File: internals.info,  Node: The Lisp Language,  Next: XEmacs From the Perspective of Building,  Prev: XEmacs From the Outside,  Up: Top

The Lisp Language
*****************

   Lisp is a general-purpose language that is higher-level than C and in
many ways more powerful than C.  Powerful dialects of Lisp such as
Common Lisp are probably much better languages for writing very large
applications than is C. (Unfortunately, for many non-technical reasons
C and its successor C++ have become the dominant languages for
application development.  These languages are both inadequate for
extremely large applications, which is evidenced by the fact that newer,
larger programs are becoming ever harder to write and are requiring ever
more programmers despite great improvements in C development
environments; and by the fact that, although hardware speeds and
reliability have been
growing at an exponential rate, most software is still generally
considered to be slow and buggy.)

   The new Java language holds promise as a better general-purpose
development language than C.  Java has many features in common with
Lisp that are not shared by C (this is not a coincidence, since Java
was designed by James Gosling, a former Lisp hacker).  This will be
discussed more later.

   For those used to C, here is a summary of the basic differences
between C and Lisp:

  1. Lisp has an extremely regular syntax.  Every function, expression,
     and control statement is written in the form

             (FUNC ARG1 ARG2 ...)

     This is as opposed to C, which writes functions as

             func(ARG1, ARG2, ...)

     but writes expressions involving operators as (e.g.)

             ARG1 + ARG2

     and writes control statements as (e.g.)

             while (EXPR) { STATEMENT1; STATEMENT2; ... }

     Lisp equivalents of the latter two would be

             (+ ARG1 ARG2 ...)

     and

             (while EXPR STATEMENT1 STATEMENT2 ...)

  2. Lisp is a safe language.  Assuming there are no bugs in the Lisp
     interpreter/compiler, it is impossible to write a program that
     "core dumps" or otherwise causes the machine to execute an illegal
     instruction.  This is very different from C, where perhaps the most
     common outcome of a bug is exactly such a crash.  A corollary of
     this is that the C operation of casting a pointer is impossible
     (and unnecessary) in Lisp, and that it is impossible to access
     memory outside the bounds of an array.

  3. Programs and data are written in the same form.  The
     parenthesis-enclosing form described above for statements is the
     same form used for the most common data type in Lisp, the list.
     Thus, it is possible to represent any Lisp program using Lisp data
     types, and for one program to construct Lisp statements and then
     dynamically "evaluate" them, or cause them to execute.

  4. All objects are "dynamically typed".  This means that part of every
     object is an indication of what type it is.  A Lisp program can
     manipulate an object without knowing what type it is, and can
     query an object to determine its type.  This means that,
     correspondingly, variables and function parameters can hold
     objects of any type and are not normally declared as being of any
     particular type.  This is opposed to the "static typing" of C,
     where variables can hold exactly one type of object and must be
     declared as such, and objects do not contain an indication of
     their type because it's implicit in the variables they are stored
     in.  It is possible in C to have a variable hold different types
     of objects (e.g. through the use of `void *' pointers or
     variable-argument functions), but the type information must then be
     passed explicitly in some other fashion, leading to additional
     program complexity.

  5. Allocated memory is automatically reclaimed when it is no longer
     in use.  This operation is called "garbage collection" and
     involves looking through all variables to see what memory is being
     pointed to, and reclaiming any memory that is not pointed to and
     is thus "inaccessible" and out of use.  This is as opposed to C,
     in which allocated memory must be explicitly reclaimed using
     `free()'.  If you simply drop all pointers to memory without
     freeing it, it becomes "leaked" memory that still takes up space.
     Over a long period of time, this can cause your program to grow
     and grow until it runs out of memory.

  6. Lisp has built-in facilities for handling errors and exceptions.
     In C, when an error occurs, usually either the program exits
     entirely or the routine in which the error occurs returns a value
     indicating this.  If an error occurs in a deeply-nested routine,
     then every routine currently called must unwind itself normally
     and return an error value back up to the next routine.  This means
     that every routine must explicitly check for an error in all the
     routines it calls; if it does not do so, unexpected and often
     random behavior results.  This is an extremely common source of
     bugs in C programs.  An alternative would be to do a non-local
     exit using `longjmp()', but that is often very dangerous because
     the routines that were exited past had no opportunity to clean up
     after themselves and may leave things in an inconsistent state,
     causing a crash shortly afterwards.

     Lisp provides mechanisms to make such non-local exits safe.  When
     an error occurs, a routine simply signals that an error of a
     particular class has occurred, and a non-local exit takes place.
     Any routine can trap errors occurring in routines it calls by
     registering an error handler for some or all classes of errors.
     (If no handler is registered, a default handler, generally
     installed by the top-level event loop, is executed; this prints
     out the error and continues.) Routines can also specify cleanup
     code (called an "unwind-protect") that will be called when control
     exits from a block of code, no matter how that exit occurs--i.e.
     even if a function deeply nested below it causes a non-local exit
     back to the top level.

     Note that this facility has appeared in some recent vintages of C,
     in particular Visual C++ and other PC compilers written for the
     Microsoft Win32 API.

  7. In Emacs Lisp, local variables are "dynamically scoped".  This
     means that if you declare a local variable in a particular
     function, and then call another function, that subfunction can
     "see" the local variable you declared.  This is actually
     considered a bug in Emacs Lisp and in all other early dialects of
     Lisp, and was corrected in Common Lisp. (In Common Lisp, you can
     still declare dynamically scoped variables if you want to--they
     are sometimes useful--but variables by default are "lexically
     scoped" as in C.)
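
   The dynamic scoping described in item 7 can be imitated in C with a
stack of saved values.  The following is a minimal sketch, assuming a
single integer variable; the names `bind_variable', `unbind_to',
`indent_level', and `call_with_indent' are hypothetical and chosen for
illustration, and this is not the mechanism XEmacs itself uses.

```c
#include <assert.h>

/* Hypothetical sketch; not XEmacs's actual binding code. */
enum { STACK_MAX = 16 };

static int indent_level = 0;   /* a Lisp-style variable, initially 0 */

struct binding { int *var; int old_value; };
static struct binding binding_stack[STACK_MAX];
static int binding_depth = 0;

/* Temporarily give *VAR a new value, remembering the old one. */
static void bind_variable(int *var, int new_value)
{
  assert(binding_depth < STACK_MAX);
  binding_stack[binding_depth].var = var;
  binding_stack[binding_depth].old_value = *var;
  binding_depth++;
  *var = new_value;
}

/* Undo bindings until the stack is back to DEPTH entries. */
static void unbind_to(int depth)
{
  while (binding_depth > depth) {
    binding_depth--;
    *binding_stack[binding_depth].var =
      binding_stack[binding_depth].old_value;
  }
}

/* The "subfunction": it is never passed indent_level, yet it sees
   whatever binding its caller established. */
static int callee_reads_indent(void)
{
  return indent_level;
}

/* The caller binds the variable for the duration of the call, as a
   Lisp `let' would under dynamic scoping, then restores it. */
int call_with_indent(int level)
{
  int depth = binding_depth;
  int seen;
  bind_variable(&indent_level, level);
  seen = callee_reads_indent();
  unbind_to(depth);
  return seen;
}

/* The value visible outside any binding. */
int current_indent(void)
{
  return indent_level;
}
```

   Here `call_with_indent(4)' returns 4 because the callee reads the
caller's temporary binding, and `current_indent()' returns 0 again
afterwards.  Under lexical scoping, a callee could not see a caller's
local variable at all.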

   For those familiar with Lisp, Emacs Lisp is modelled after MacLisp,
an early dialect of Lisp developed at MIT (no relation to the Macintosh
computer).  There is a Common Lisp compatibility package available for
Emacs that provides many of the features of Common Lisp.

   The Java language is derived in many ways from C, and shares a
similar syntax, but has the following features in common with Lisp (and
different from C):

  1. Java is a safe language, like Lisp.

  2. Java provides garbage collection, like Lisp.

  3. Java has built-in facilities for handling errors and exceptions,
     like Lisp.

  4. Java has a type system that combines the advantages of both
     static and dynamic typing.  Objects (except very simple types) are
     explicitly marked with their type, as in dynamic typing; but there
     is a hierarchy of types and functions are declared to accept only
     certain types, thus providing the increased compile-time
     error-checking of static typing.
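
   The idea that an object is marked with its own type, which appears
both in the description of Lisp's dynamic typing and in item 4 here,
can be sketched in C with a tagged union.  This is a hypothetical
representation for illustration only (`struct object', `make_int',
`make_string', `type_name', and `length_of' are invented names); the
representation XEmacs actually uses is different.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical representation, for illustration only. */
enum obj_type { OBJ_INT, OBJ_STRING };

struct object {
  enum obj_type type;        /* the indication of what type this is */
  union {
    long integer;
    const char *string;
  } u;
};

struct object make_int(long n)
{
  struct object o;
  o.type = OBJ_INT;
  o.u.integer = n;
  return o;
}

struct object make_string(const char *s)
{
  struct object o;
  assert(s != NULL);
  o.type = OBJ_STRING;
  o.u.string = s;
  return o;
}

/* Query an object for its type without knowing it in advance. */
const char *type_name(struct object o)
{
  return o.type == OBJ_INT ? "integer" : "string";
}

/* A parameter that accepts an object of any type; the tag is checked
   at run time.  Returns -1 for objects that have no length, where a
   Lisp system would signal an error instead. */
long length_of(struct object o)
{
  switch (o.type) {
  case OBJ_STRING:
    return (long) strlen(o.u.string);
  default:
    return -1;
  }
}
```

   With these definitions, `length_of(make_string("emacs"))' is 5,
while `length_of(make_int(7))' is -1: the mistake is detected at run
time by inspecting the tag, not at compile time from the declared type.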

   The Java language also has some negative attributes:

  1. Java uses the edit/compile/run model of software development.  This
     makes it hard to use interactively.  For example, to use Java like
     `bc' it is necessary to write a special purpose, albeit tiny,
     application.  In Emacs Lisp, a calculator comes built-in without
     any effort--one can always just type an expression in the
     `*scratch*' buffer.

  2. Java tries too hard to enforce, not merely enable, portability,
     making ordinary access to standard OS facilities painful.  Java
     has an "agenda".  I think this is why `chdir' is not part of
     standard Java, which is inexcusable.

   Unfortunately, there is no perfect language.  Static typing allows a
compiler to catch programmer errors and produce more efficient code, but
makes programming more tedious and less fun.  For the foreseeable
future, an Ideal Editing and Programming Environment (and that is what
XEmacs aspires to) will be programmable in multiple languages: high
level ones like Lisp for user customization and prototyping, and lower
level ones for infrastructure and industrial strength applications.  If
I had my way, XEmacs would be friendly towards the Python, Scheme, C++,
ML, and other communities.  But there are serious technical difficulties
in achieving that goal.

   The word "application" in the previous paragraph was used
intentionally.  XEmacs implements an API for programs written in Lisp
that makes it a full-fledged application platform, very much like an OS
inside the real OS.


File: internals.info,  Node: XEmacs From the Perspective of Building,  Next: XEmacs From the Inside,  Prev: The Lisp Language,  Up: Top

XEmacs From the Perspective of Building
***************************************

   The heart of XEmacs is the Lisp environment, which is written in C.
This is contained in the `src/' subdirectory.  Underneath `src/' are
two subdirectories of header files: `s/' (header files for particular
operating systems) and `m/' (header files for particular machine
types).  In practice the distinction between the two types of header
files is blurred.  These header files define or undefine certain
preprocessor constants and macros to indicate particular
characteristics of the associated machine or operating system.  As part
of the configure process, one `s/' file and one `m/' file are
identified for the particular environment in which XEmacs is being
built.
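
   As a rough illustration of the mechanism just described, the
following sketch shows how preprocessor settings of the kind found in
an `s/' file and an `m/' file might steer ordinary C code.  Every name
in it (the `EXAMPLE_' macros and both functions) is hypothetical and
invented for this example; none is taken from the XEmacs sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In a real build, settings like these would come from the selected
   `s/' and `m/' headers.  They are written inline here, under
   made-up names, so that the sketch compiles on its own. */
#define EXAMPLE_OS_PATH_SEPARATOR '/'  /* hypothetical `s/' setting */
#undef EXAMPLE_OS_HAS_DRIVE_LETTERS    /* hypothetical `s/' setting */
#define EXAMPLE_MACHINE_WORD_BITS 32   /* hypothetical `m/' setting */

/* Return nonzero if PATH is absolute on the configured system. */
int
example_path_is_absolute (const char *path)
{
  assert (path != NULL);
#ifdef EXAMPLE_OS_HAS_DRIVE_LETTERS
  /* Compiled only for systems whose header defines the macro. */
  if (strlen (path) >= 2 && path[1] == ':')
    return 1;
#endif
  return path[0] == EXAMPLE_OS_PATH_SEPARATOR;
}

/* Return the size of a machine word in bytes, as declared by the
   machine-description setting above. */
int
example_machine_word_bytes (void)
{
  return EXAMPLE_MACHINE_WORD_BITS / 8;
}
```

   The C code itself never asks which system it is running on; it only
tests the constants that the chosen header files define or undefine.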

   XEmacs also contains a great deal of Lisp code.  This implements the
operations that make XEmacs useful as an editor as well as just a Lisp
environment, and also contains many add-on packages that allow XEmacs to
browse directories, act as a mail and Usenet news reader, compile Lisp
code, etc.  There is actually more Lisp code than C code associated with
XEmacs, but much of the Lisp code is peripheral to the actual operation
of the editor.  The Lisp code all lies in subdirectories underneath the
`lisp/' directory.

   The `lwlib/' directory contains C code that implements a generalized
interface onto different X widget toolkits and also implements some
widgets of its own that behave like Motif widgets but are faster, free,
and in some cases more powerful.  The code in this directory compiles
into a library and is mostly independent from XEmacs.

   The `etc/' directory contains various data files associated with
XEmacs.  Some of them are actually read by XEmacs at startup; others
merely contain useful information of various sorts.

   The `lib-src/' directory contains C code for various auxiliary
programs that are used in connection with XEmacs.  Some of them are used
during the build process; others are used to perform certain functions
that cannot conveniently be placed in the XEmacs executable (e.g. the
`movemail' program for fetching mail out of `/var/spool/mail', which
must be setgid to `mail' on many systems; and the `gnuclient' program,
which allows an external script to communicate with a running XEmacs
process).

   The `man/' directory contains the sources for the XEmacs
documentation.  It is mostly in a form called Texinfo, which can be
converted into either a printed document (by passing it through TeX) or
into on-line documentation called "info files".

   The `info/' directory contains the results of formatting the XEmacs
documentation as "info files", for on-line use.  These files are used
when you enter the Info system using `C-h i' or through the Help menu.

   The `dynodump/' directory contains auxiliary code used to build
XEmacs on Solaris platforms.

   The other directories contain various miscellaneous code and
information that is not normally used or needed.

   The first step of building involves running the `configure' program
and passing it various parameters to specify any optional features you
want and compiler arguments and such, as described in the `INSTALL'
file.  This determines what the build environment is, chooses the
appropriate `s/' and `m/' files, and runs a series of tests to determine
many details about your environment, such as which library functions
are available and exactly how they work.  The reason for running these
tests is that it allows XEmacs to be compiled on a much wider variety
of platforms than those that the XEmacs developers happen to be
familiar with, including various sorts of hybrid platforms.  This is
especially important now that many operating systems give you a great
deal of control over exactly what features you want installed, and allow
for easy upgrading of parts of a system without upgrading the rest.  It
would be impossible to pre-determine and pre-specify the information for
all possible configurations.

   In fact, the `s/' and `m/' files are basically _evil_, since they
contain unmaintainable platform-specific hard-coded information.
XEmacs has been moving in the direction of having all system-specific
information be determined dynamically by `configure'.  Perhaps someday
we can `rm -rf src/s src/m'.
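
   By way of contrast, here is a minimal sketch of code that relies on
the result of a feature test rather than on a hard-coded platform
description.  It assumes that the test leaves behind a preprocessor
macro in the generated configuration header; the macro name
`EXAMPLE_HAVE_STRERROR' and the function `example_describe_error' are
hypothetical and are not claimed to exist in XEmacs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a line that a feature test might write into the
   generated configuration header when the test succeeds.  The macro
   name is hypothetical. */
#define EXAMPLE_HAVE_STRERROR 1

/* Store a description of error code ERRNUM in BUF, which holds SIZE
   bytes.  The text is truncated if it does not fit. */
void
example_describe_error (int errnum, char *buf, size_t size)
{
  assert (buf != NULL && size > 0);
#ifdef EXAMPLE_HAVE_STRERROR
  /* The test found the library function, so use it. */
  snprintf (buf, size, "%s", strerror (errnum));
#else
  /* Fallback for systems on which the test failed. */
  snprintf (buf, size, "error %d", errnum);
#endif
}
```

   Nothing in this code names an operating system or machine type, so
it keeps working on a platform that nobody has described in advance,
provided the test itself gives the right answer there.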

   When `configure' finishes running, it generates `Makefile's and
`GNUmakefile's and the file `src/config.h' (which describes the
features of your system) from template files.  You then run `make',
which compiles the auxiliary code and programs in `lib-src/' and
`lwlib/' and the main XEmacs executable in `src/'.  The result of
compiling and linking is an executable called `temacs', which is _not_
the final XEmacs executable.  `temacs' by itself is not intended to
function as an editor or even display any windows on the screen, and if
you simply run it, it will exit immediately.  The `Makefile' runs
`temacs' with certain options that cause it to initialize itself, read
in a number of basic Lisp files, and then dump itself out into a new
executable called `xemacs'.  This new executable has been
pre-initialized and contains pre-digested Lisp code that is necessary
for the editor to function (this includes most basic editing functions,
e.g. `kill-line', that can be defined in terms of other Lisp
primitives; some initialization code that is called when certain
objects, such as frames, are created; and all of the standard
keybindings and code for the actions they result in).  This executable,
`xemacs', is the executable that you run to use the XEmacs editor.

   Although `temacs' is not intended to be run as an editor, it can, by
using the incantation `temacs -batch -l loadup.el run-temacs'.  This is
useful when the dumping procedure described above is broken, or when
using certain program debugging tools such as Purify.  These tools get
mighty confused by the tricks played by the XEmacs build process, such
as allocating memory in one process and freeing it in the next.


File: internals.info,  Node: XEmacs From the Inside,  Next: The XEmacs Object System (Abstractly Speaking),  Prev: XEmacs From the Perspective of Building,  Up: Top

XEmacs From the Inside
**********************

   Internally, XEmacs is quite complex, and can be very confusing.  To
simplify things, it can be useful to think of XEmacs as containing an
event loop that "drives" everything, and a number of other subsystems,
such as a Lisp engine and a redisplay mechanism.  Each of these other
subsystems exists simultaneously in XEmacs, and each has a certain
state.  The flow of control continually passes in and out of these
different subsystems in the course of normal operation of the editor.

   It is important to keep in mind that, most of the time, the editor is
"driven" by the event loop.  Except during initialization and batch
mode, all subsystems are entered directly or indirectly through the
event loop, and ultimately, control exits out of all subsystems back up
to the event loop.  This cycle of entering a subsystem, exiting back out
to the event loop, and starting another iteration of the event loop
occurs once for each keystroke, mouse motion, etc.

   If you're trying to understand a particular subsystem (other than the
event loop), think of it as a "daemon" process or "servant" that is
responsible for one particular aspect of a larger system, and
periodically receives commands or environment changes that cause it to
do something.  Ultimately, these commands and environment changes are
always triggered by the event loop.  For example:

   * The window and frame mechanism is responsible for keeping track of
     what windows and frames exist, what buffers are in them, etc.  It
     is periodically given commands (usually from the user) to make a
     change to the current window/frame state: e.g. create a new frame,
     delete a window, etc.

   * The buffer mechanism is responsible for keeping track of what
     buffers exist and what text is in them.  It is periodically given
     commands (usually from the user) to insert or delete text, create
     a buffer, etc.  When it receives a text-change command, it
     notifies the redisplay mechanism.

   * The redisplay mechanism is responsible for making sure that
     windows and frames are displayed correctly.  It is periodically
     told (by the event loop) to actually "do its job", i.e. snoop
     around and see what the current state of the environment (mostly
     of the currently-existing windows, frames, and buffers) is, and
     make sure that that state matches what's actually displayed.  It
     keeps lots and lots of information around (such as what is
     actually being displayed currently, and what the environment was
     last time it checked) so that it can minimize the work it has to
     do.  It is also helped along in that whenever a relevant change to
     the environment occurs, the redisplay mechanism is told about
     this, so it has a pretty good idea of where it has to look to find
     possible changes and doesn't have to look everywhere.

   * The Lisp engine is responsible for executing the Lisp code in
     which most user commands are written.  It is entered through a
     call to `eval' or `funcall', which occurs as a result of
     dispatching an event from the event loop.  The functions it calls
     issue commands to the buffer mechanism, the window/frame
     subsystem, etc.

   * The Lisp allocation subsystem is responsible for keeping track of
     Lisp objects.  It is given commands from the Lisp engine to
     allocate objects, garbage collect, etc.

   etc.

   The important idea here is that there are a number of independent
subsystems each with its own responsibility and persistent state, just
like different employees in a company, and each subsystem is
periodically given commands from other subsystems.  Commands can flow
from any one subsystem to any other, but there is usually some sort of
hierarchy, with all commands originating from the event subsystem.
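
   The division of labor described above can be sketched in a few
dozen lines of C.  This is a toy model under simplifying assumptions,
not the XEmacs implementation: there is a single fixed-size buffer, the
"display" is just a recorded length, and all of the names (the
`example_' types and functions) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum example_event_type
{
  EXAMPLE_INSERT_CHAR,   /* a self-inserting keystroke */
  EXAMPLE_DELETE_CHAR,   /* a deletion keystroke */
  EXAMPLE_MOUSE_MOTION   /* changes no text in this toy */
};

struct example_event
{
  enum example_event_type type;
  char ch;               /* used only by EXAMPLE_INSERT_CHAR */
};

/* Each subsystem keeps its own persistent state. */
struct example_editor
{
  char text[64];         /* buffer mechanism: the text */
  size_t text_len;
  int dirty;             /* redisplay: was a change reported? */
  size_t displayed_len;  /* redisplay: what is on the "screen" */
  int redraws;           /* redisplay: how often it did real work */
};

/* Buffer mechanism: change the text, then notify redisplay. */
static void
example_buffer_command (struct example_editor *ed,
                        const struct example_event *ev)
{
  if (ev->type == EXAMPLE_INSERT_CHAR && ed->text_len < sizeof ed->text)
    ed->text[ed->text_len++] = ev->ch;
  else if (ev->type == EXAMPLE_DELETE_CHAR && ed->text_len > 0)
    ed->text_len--;
  else
    return;              /* nothing changed, nothing to report */
  ed->dirty = 1;
}

/* Redisplay: bring the display up to date, skipping the work when
   no change was reported since the last call. */
static void
example_redisplay (struct example_editor *ed)
{
  if (!ed->dirty)
    return;
  ed->displayed_len = ed->text_len;
  ed->redraws++;
  ed->dirty = 0;
}

/* Event loop: every command originates here, and control returns
   here after each event has been handled. */
void
example_event_loop (struct example_editor *ed,
                    const struct example_event *events, size_t count)
{
  size_t i;

  assert (ed != NULL && (events != NULL || count == 0));
  for (i = 0; i < count; i++)
    {
      example_buffer_command (ed, &events[i]);
      example_redisplay (ed);
    }
}
```

   Note that the redisplay function is called after every event but
does real work only when the buffer mechanism has reported a change,
which mirrors the way the real redisplay mechanism is told where to
look instead of having to look everywhere.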

   XEmacs is entered in `main()', which is in `emacs.c'.  When this is
called the first time (in a properly-invoked `temacs'), it does the
following:

  1. It does some very basic environment initializations, such as
     determining where it and its directories (e.g. `lisp/' and `etc/')
     reside and setting up signal handlers.

  2. It initializes the entire Lisp interpreter.

  3. It sets the initial values of many built-in variables (including
     many variables that are visible to Lisp programs), such as the
     global keymap object and the built-in faces (a face is an object
     that describes the display characteristics of text).  This
     involves creating Lisp objects and thus is dependent on step (2).

  4. It performs various other initializations that are relevant to the
     particular environment it is running in, such as retrieving
     environment variables, determining the current date and the user
     who is running the program, examining its standard input, creating
     any necessary file descriptors, etc.

  5. At this point, the C initialization is complete.  A Lisp program
     that was specified on the command line (usually `loadup.el') is
     loaded (`temacs' is normally invoked as `temacs -batch -l
     loadup.el dump').  `loadup.el' loads all of the other Lisp files
     that are needed for the operation of the editor, calls the
     `dump-emacs' function to write out `xemacs', and then kills the
     `temacs' process.

   When `xemacs' is then run, it redoes only steps (1) and (4) above;
all variables already contain the values they were set to when the
executable was dumped, and all memory that was allocated with
`malloc()' is still around. (XEmacs knows whether it is being run as
`xemacs' or `temacs' because it sets the global variable `initialized'
to 1 after step (4) above.) At this point, `xemacs' calls a Lisp
function to do any further initialization, which includes parsing the
command-line (the C code can only do limited command-line parsing,
which includes looking for the `-batch' and `-l' flags and a few other
flags that it needs to know about before initialization is complete),
creating the first frame (or "window" in standard window-system
parlance), running the user's init file (usually the file `.emacs' in
the user's home directory), etc.  The function to do this is usually
called `normal-top-level'; `loadup.el' tells the C code about this
function by setting its name as the value of the Lisp variable
`top-level'.
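
   The role of the `initialized' flag can be sketched as follows.  This
is a toy model, not the code in `emacs.c': the function and structure
names are hypothetical, each initialization step is reduced to a
counter, and calling the function twice in one process stands in for
running `temacs' and then running the dumped `xemacs', whose variables
keep the values they had at dump time.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for the global flag that is preserved by dumping. */
int example_initialized;

/* Counts how often each group of initialization steps has run. */
struct example_init_log
{
  int basic_env;    /* step (1) */
  int lisp_engine;  /* step (2) */
  int builtin_vars; /* step (3) */
  int runtime_env;  /* step (4) */
};

/* Perform the C-level initialization, skipping the steps whose
   results are already present in a dumped image. */
void
example_c_init (struct example_init_log *log)
{
  assert (log != NULL);

  log->basic_env++;             /* always: paths, signal handlers */
  if (!example_initialized)
    {
      log->lisp_engine++;       /* only before dumping */
      log->builtin_vars++;      /* only before dumping */
    }
  log->runtime_env++;           /* always: environment, user, date */

  example_initialized = 1;      /* set after step (4) */
}
```

   After the first call all four counters are 1; after the second call
only the counters for steps (1) and (4) have advanced.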

   When the Lisp initialization code is done, the C code enters the
event loop, and stays there for the duration of the XEmacs process.
The code for the event loop is contained in `cmdloop.c', and is called
`Fcommand_loop_1()'.  Note that this event loop could very well be
written in Lisp, and in fact a Lisp version exists; but apparently,
doing this makes XEmacs run noticeably slower.

   Notice how much of the initialization is done in Lisp, not in C.  In
general, XEmacs tries to move as much code as possible into Lisp.
Code that remains in C is code that implements the Lisp interpreter
itself, or code that needs to be very fast, or code that needs to make
system calls or do other things that can only be done in C, or code
that needs to have access to "forbidden" structures. (One conscious
aspect of the design of Lisp under XEmacs is a clean separation between
the external interface to a Lisp object's functionality and its internal
implementation.  Part of this design is that Lisp programs are
forbidden from accessing the contents of the object other than through
using a standard API.  In this respect, XEmacs Lisp is similar to
modern Lisp dialects but differs from GNU Emacs, which tends to expose
the implementation and allow Lisp programs to look at it directly.  The
major advantage of hiding the implementation is that it allows the
implementation to be redesigned without affecting any Lisp programs,
including those that might want to be "clever" by looking directly at
the object's contents and possibly manipulating them.)
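
   The same principle of separating interface from implementation can
be illustrated at the C level with an opaque type.  The sketch below
is only an analogy, written under the assumption that a simplified
"marker" object is a reasonable stand-in; the type and functions are
hypothetical and do not reproduce any XEmacs data structure.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Interface: all that client code is meant to rely on. ---- */

struct example_marker;   /* contents deliberately not exposed */

struct example_marker *example_marker_new (long position);
long example_marker_position (const struct example_marker *m);
void example_marker_set_position (struct example_marker *m,
                                  long position);
void example_marker_free (struct example_marker *m);

/* ---- Implementation: free to change without notice. ---- */

struct example_marker
{
  /* Stored zero-based, although the interface is one-based.
     Clients that use only the functions above never notice. */
  long index;
};

struct example_marker *
example_marker_new (long position)
{
  struct example_marker *m = malloc (sizeof *m);

  if (m != NULL)
    m->index = position - 1;
  return m;              /* NULL if allocation failed */
}

long
example_marker_position (const struct example_marker *m)
{
  assert (m != NULL);
  return m->index + 1;
}

void
example_marker_set_position (struct example_marker *m, long position)
{
  assert (m != NULL);
  m->index = position - 1;
}

void
example_marker_free (struct example_marker *m)
{
  free (m);
}
```

   If the stored representation were later changed, for instance to a
byte offset, only the four functions would need to be rewritten; code
that had reached into the structure directly would break.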

   Moving code into Lisp makes the code easier to debug and maintain and
makes it much easier for people who are not XEmacs developers to
customize XEmacs, because they can make a change with much less chance
of obscure and unwanted interactions occurring than if they were to
change the C code.