This is ../info/internals.info, produced by makeinfo version 4.0 from internals/internals.texi.

INFO-DIR-SECTION XEmacs Editor
START-INFO-DIR-ENTRY
* Internals: (internals).       XEmacs Internals Manual.
END-INFO-DIR-ENTRY

   Copyright (C) 1992 - 1996 Ben Wing.
   Copyright (C) 1996, 1997 Sun Microsystems.
   Copyright (C) 1994 - 1998 Free Software Foundation.
   Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.

   Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

   Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

   Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.

   Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the section entitled "GNU General Public License" is included exactly as in the original, and provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

   Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that the section entitled "GNU General Public License" may be included in a translation approved by the Free Software Foundation instead of in the original English.


File: internals.info,  Node: Top,  Next: A History of Emacs,  Prev: (dir),  Up: (dir)

   This Info file contains v1.0 of the XEmacs Internals Manual.

* Menu:

* A History of Emacs::          Times, dates, important events.
* XEmacs From the Outside::     A broad conceptual overview.
* The Lisp Language::           An overview.
* XEmacs From the Perspective of Building::
* XEmacs From the Inside::
* The XEmacs Object System (Abstractly Speaking)::
* How Lisp Objects Are Represented in C::
* Rules When Writing New C Code::
* A Summary of the Various XEmacs Modules::
* Allocation of Objects in XEmacs Lisp::
* Dumping::
* Events and the Event Loop::
* Evaluation; Stack Frames; Bindings::
* Symbols and Variables::
* Buffers and Textual Representation::
* MULE Character Sets and Encodings::
* The Lisp Reader and Compiler::
* Lstreams::
* Consoles; Devices; Frames; Windows::
* The Redisplay Mechanism::
* Extents::
* Faces::
* Glyphs::
* Specifiers::
* Menus::
* Subprocesses::
* Interface to the X Window System::
* Index::

 --- The Detailed Node Listing ---

A History of Emacs

* Through Version 18::          Unification prevails.
* Lucid Emacs::                 One version 19 Emacs.
* GNU Emacs 19::                The other version 19 Emacs.
* GNU Emacs 20::                The other version 20 Emacs.
* XEmacs::                      The continuation of Lucid Emacs.

Rules When Writing New C Code

* General Coding Rules::
* Writing Lisp Primitives::
* Adding Global Lisp Variables::
* Coding for Mule::
* Techniques for XEmacs Developers::

Coding for Mule

* Character-Related Data Types::
* Working With Character and Byte Positions::
* Conversion to and from External Data::
* General Guidelines for Writing Mule-Aware Code::
* An Example of Mule-Aware Code::

A Summary of the Various XEmacs Modules

* Low-Level Modules::
* Basic Lisp Modules::
* Modules for Standard Editing Operations::
* Editor-Level Control Flow Modules::
* Modules for the Basic Displayable Lisp Objects::
* Modules for other Display-Related Lisp Objects::
* Modules for the Redisplay Mechanism::
* Modules for Interfacing with the File System::
* Modules for Other Aspects of the Lisp Interpreter and Object System::
* Modules for Interfacing with the Operating System::
* Modules for Interfacing with X Windows::
* Modules for Internationalization::

Allocation of Objects in XEmacs Lisp

* Introduction to Allocation::
* Garbage Collection::
* GCPROing::
* Garbage Collection - Step by Step::
* Integers and Characters::
* Allocation from Frob Blocks::
* lrecords::
* Low-level allocation::
* Cons::
* Vector::
* Bit Vector::
* Symbol::
* Marker::
* String::
* Compiled Function::

Garbage Collection - Step by Step

* Invocation::
* garbage_collect_1::
* mark_object::
* gc_sweep::
* sweep_lcrecords_1::
* compact_string_chars::
* sweep_strings::
* sweep_bit_vectors_1::

Dumping

* Overview::
* Data descriptions::
* Dumping phase::
* Reloading phase::

Dumping phase

* Object inventory::
* Address allocation::
* The header::
* Data dumping::
* Pointers dumping::

Events and the Event Loop

* Introduction to Events::
* Main Loop::
* Specifics of the Event Gathering Mechanism::
* Specifics About the Emacs Event::
* The Event Stream Callback Routines::
* Other Event Loop Functions::
* Converting Events::
* Dispatching Events; The Command Builder::

Evaluation; Stack Frames; Bindings

* Evaluation::
* Dynamic Binding; The specbinding Stack; Unwind-Protects::
* Simple Special Forms::
* Catch and Throw::

Symbols and Variables

* Introduction to Symbols::
* Obarrays::
* Symbol Values::

Buffers and Textual Representation

* Introduction to Buffers::     A buffer holds a block of text such as a file.
* The Text in a Buffer::        Representation of the text in a buffer.
* Buffer Lists::                Keeping track of all buffers.
* Markers and Extents::         Tagging locations within a buffer.
* Bufbytes and Emchars::        Representation of individual characters.
* The Buffer Object::           The Lisp object corresponding to a buffer.

MULE Character Sets and Encodings

* Character Sets::
* Encodings::
* Internal Mule Encodings::
* CCL::

Encodings

* Japanese EUC (Extended Unix Code)::
* JIS7::

Internal Mule Encodings

* Internal String Encoding::
* Internal Character Encoding::

Lstreams

* Creating an Lstream::         Creating an lstream object.
* Lstream Types::               Different sorts of things that are streamed.
* Lstream Functions::           Functions for working with lstreams.
* Lstream Methods::             Creating new lstream types.

Consoles; Devices; Frames; Windows

* Introduction to Consoles; Devices; Frames; Windows::
* Point::
* Window Hierarchy::
* The Window Object::

The Redisplay Mechanism

* Critical Redisplay Sections::
* Line Start Cache::
* Redisplay Piece by Piece::

Extents

* Introduction to Extents::     Extents are ranges over text, with properties.
* Extent Ordering::             How extents are ordered internally.
* Format of the Extent Info::   The extent information in a buffer or string.
* Zero-Length Extents::         A weird special case.
* Mathematics of Extent Ordering::  A rigorous foundation.
* Extent Fragments::            Cached information useful for redisplay.


File: internals.info,  Node: A History of Emacs,  Next: XEmacs From the Outside,  Prev: Top,  Up: Top

A History of Emacs
******************

   XEmacs is a powerful, customizable text editor and development environment.
It began as Lucid Emacs, which was in turn derived from GNU Emacs, a program written by Richard Stallman of the Free Software Foundation. GNU Emacs dates back to the 1970's, and was modelled after a package called "Emacs", written in 1976, that was a set of macros on top of TECO, an old, old text editor written at MIT on the DEC PDP 10 under one of the earliest time-sharing operating systems, ITS (Incompatible Timesharing System). (ITS dates back well before Unix.) ITS, TECO, and Emacs were products of a group of people at MIT who called themselves "hackers", who shared an idealistic belief system about the free exchange of information and were fanatical in their devotion to and time spent with computers. (The hacker subculture dates back to the late 1950's at MIT and is described in detail in Steven Levy's book `Hackers'. This book also includes a lot of information about Stallman himself and the development of Lisp, a programming language developed at MIT that underlies Emacs.)

* Menu:

* Through Version 18::          Unification prevails.
* Lucid Emacs::                 One version 19 Emacs.
* GNU Emacs 19::                The other version 19 Emacs.
* GNU Emacs 20::                The other version 20 Emacs.
* XEmacs::                      The continuation of Lucid Emacs.


File: internals.info,  Node: Through Version 18,  Next: Lucid Emacs,  Prev: A History of Emacs,  Up: A History of Emacs

Through Version 18
==================

   Although the history of the early versions of GNU Emacs is unclear, the history is well-known from the middle of 1985. A time line is:

   * GNU Emacs version 15 (15.34) was released sometime in 1984 or 1985 and shared some code with a version of Emacs written by James Gosling (the same James Gosling who later created the Java language).
   * GNU Emacs version 16 (first released version was 16.56) was released on July 15, 1985. All Gosling code was removed due to potential copyright problems with the code.
   * version 16.57: released on September 16, 1985.
   * versions 16.58, 16.59: released on September 17, 1985.
   * version 16.60: released on September 19, 1985. These later version 16's incorporated patches from the net, esp. for getting Emacs to work under System V.
   * version 17.36 (first official v17 release) released on December 20, 1985. Included a TeX-able user manual. First official unpatched version that worked on vanilla System V machines.
   * version 17.43 (second official v17 release) released on January 25, 1986.
   * version 17.45 released on January 30, 1986.
   * version 17.46 released on February 4, 1986.
   * version 17.48 released on February 10, 1986.
   * version 17.49 released on February 12, 1986.
   * version 17.55 released on March 18, 1986.
   * version 17.57 released on March 27, 1986.
   * version 17.58 released on April 4, 1986.
   * version 17.61 released on April 12, 1986.
   * version 17.63 released on May 7, 1986.
   * version 17.64 released on May 12, 1986.
   * version 18.24 (a beta version) released on October 2, 1986.
   * version 18.30 (a beta version) released on November 15, 1986.
   * version 18.31 (a beta version) released on November 23, 1986.
   * version 18.32 (a beta version) released on December 7, 1986.
   * version 18.33 (a beta version) released on December 12, 1986.
   * version 18.35 (a beta version) released on January 5, 1987.
   * version 18.36 (a beta version) released on January 21, 1987.
   * January 27, 1987: The Great Usenet Renaming. net.emacs is now comp.emacs.
   * version 18.37 (a beta version) released on February 12, 1987.
   * version 18.38 (a beta version) released on March 3, 1987.
   * version 18.39 (a beta version) released on March 14, 1987.
   * version 18.40 (a beta version) released on March 18, 1987.
   * version 18.41 (the first "official" release) released on March 22, 1987.
   * version 18.45 released on June 2, 1987.
   * version 18.46 released on June 9, 1987.
   * version 18.47 released on June 18, 1987.
   * version 18.48 released on September 3, 1987.
   * version 18.49 released on September 18, 1987.
   * version 18.50 released on February 13, 1988.
   * version 18.51 released on May 7, 1988.
   * version 18.52 released on September 1, 1988.
   * version 18.53 released on February 24, 1989.
   * version 18.54 released on April 26, 1989.
   * version 18.55 released on August 23, 1989. This is the earliest version that is still available by FTP.
   * version 18.56 released on January 17, 1991.
   * version 18.57 released late January, 1991.
   * version 18.58 released ?????.
   * version 18.59 released October 31, 1992.


File: internals.info,  Node: Lucid Emacs,  Next: GNU Emacs 19,  Prev: Through Version 18,  Up: A History of Emacs

Lucid Emacs
===========

   Lucid Emacs was developed by the (now-defunct) Lucid Inc., a maker of C++ and Lisp development environments. It began when Lucid decided they wanted to use Emacs as the editor and cornerstone of their C++ development environment (called "Energize"). They needed many features that were not available in the existing version of GNU Emacs (version 18.5something), in particular good and integrated support for GUI elements such as mouse support, multiple fonts, multiple window-system windows, etc. A branch of GNU Emacs called Epoch, written at the University of Illinois, existed that supplied many of these features; however, Lucid needed more than what existed in Epoch. At the time, the Free Software Foundation was working on version 19 of Emacs (this was sometime around 1991), which was planned to have similar features, and so Lucid decided to work with the Free Software Foundation. Their plan was to add features that they needed, and coordinate with the FSF so that the features would get included back into Emacs version 19.

   Delays in the release of version 19 occurred, however (resulting in it finally being released more than a year after what was initially planned), and Lucid encountered unexpected technical resistance in getting their changes merged back into version 19, so they decided to release their own version of Emacs, which became Lucid Emacs 19.0.

   The initial authors of Lucid Emacs were Matthieu Devin, Harlan Sexton, and Eric Benson, and the work was later taken over by Jamie Zawinski, who became "Mr. Lucid Emacs" for many releases.

   A time line for Lucid Emacs/XEmacs is

   * version 19.0 shipped with Energize 1.0, April 1992.
   * version 19.1 released June 4, 1992.
   * version 19.2 released June 19, 1992.
   * version 19.3 released September 9, 1992.
   * version 19.4 released January 21, 1993.
   * version 19.5 was a repackaging of 19.4 with a few bug fixes and shipped with Energize 2.0. Never released to the net.
   * version 19.6 released April 9, 1993.
   * version 19.7 was a repackaging of 19.6 with a few bug fixes and shipped with Energize 2.1. Never released to the net.
   * version 19.8 released September 6, 1993.
   * version 19.9 released January 12, 1994.
   * version 19.10 released May 27, 1994.
   * version 19.11 (first XEmacs) released September 13, 1994.
   * version 19.12 released June 23, 1995.
   * version 19.13 released September 1, 1995.
   * version 19.14 released June 23, 1996.
   * version 20.0 released February 9, 1997.
   * version 19.15 released March 28, 1997.
   * version 20.1 (not released to the net) April 15, 1997.
   * version 20.2 released May 16, 1997.
   * version 19.16 released October 31, 1997.
   * version 20.3 (the first stable version of XEmacs 20.x) released November 30, 1997.
   * version 20.4 released February 28, 1998.


File: internals.info,  Node: GNU Emacs 19,  Next: GNU Emacs 20,  Prev: Lucid Emacs,  Up: A History of Emacs

GNU Emacs 19
============

   About a year after the initial release of Lucid Emacs, the FSF released a beta of their version of Emacs 19 (referred to here as "GNU Emacs"). By this time, the current version of Lucid Emacs was 19.6. (Strangely, the first released beta from the FSF was GNU Emacs 19.7.) A time line for GNU Emacs version 19 is

   * version 19.8 (beta) released May 27, 1993.
   * version 19.9 (beta) released May 27, 1993.
   * version 19.10 (beta) released May 30, 1993.
   * version 19.11 (beta) released June 1, 1993.
   * version 19.12 (beta) released June 2, 1993.
   * version 19.13 (beta) released June 8, 1993.
   * version 19.14 (beta) released June 17, 1993.
   * version 19.15 (beta) released June 19, 1993.
   * version 19.16 (beta) released July 6, 1993.
   * version 19.17 (beta) released late July, 1993.
   * version 19.18 (beta) released August 9, 1993.
   * version 19.19 (beta) released August 15, 1993.
   * version 19.20 (beta) released November 17, 1993.
   * version 19.21 (beta) released November 17, 1993.
   * version 19.22 (beta) released November 28, 1993.
   * version 19.23 (beta) released May 17, 1994.
   * version 19.24 (beta) released May 16, 1994.
   * version 19.25 (beta) released June 3, 1994.
   * version 19.26 (beta) released September 11, 1994.
   * version 19.27 (beta) released September 14, 1994.
   * version 19.28 (first "official" release) released November 1, 1994.
   * version 19.29 released June 21, 1995.
   * version 19.30 released November 24, 1995.
   * version 19.31 released May 25, 1996.
   * version 19.32 released July 31, 1996.
   * version 19.33 released August 11, 1996.
   * version 19.34 released August 21, 1996.
   * version 19.34b released September 6, 1996.

   In some ways, GNU Emacs 19 was better than Lucid Emacs; in some ways, worse. Lucid soon began incorporating features from GNU Emacs 19 into Lucid Emacs; the work was mostly done by Richard Mlynarik, who had been working on and using GNU Emacs for a long time (back as far as version 16 or 17).


File: internals.info,  Node: GNU Emacs 20,  Next: XEmacs,  Prev: GNU Emacs 19,  Up: A History of Emacs

GNU Emacs 20
============

   On February 2, 1997 work began on GNU Emacs to integrate Mule. The first release was made in September of that year.

   A timeline for Emacs 20 is

   * version 20.1 released September 17, 1997.
   * version 20.2 released September 20, 1997.
   * version 20.3 released August 19, 1998.


File: internals.info,  Node: XEmacs,  Prev: GNU Emacs 20,  Up: A History of Emacs

XEmacs
======

   Around the time that Lucid was developing Energize, Sun Microsystems was developing their own development environment (called "SPARCWorks") and also decided to use Emacs. They joined forces with the Epoch team at the University of Illinois and later with Lucid. The maintainer of the last-released version of Epoch was Marc Andreessen, but he dropped out and the Epoch project, headed by Simon Kaplan, lured Chuck Thompson away from a system administration job to become the primary Lucid Emacs author for Epoch and Sun. Chuck's area of specialty became the redisplay engine (he replaced the old Lucid Emacs redisplay engine with a ported version from Epoch and then later rewrote it from scratch). Sun also hired Ben Wing (the author of Win-Emacs, a port of Lucid Emacs to Microsoft Windows 3.1) in 1993, for what was initially a one-month contract to fix some event problems but later became a many-year involvement, punctuated by a six-month contract with Amdahl Corporation.

   In 1994, Sun and Lucid agreed to rename Lucid Emacs to XEmacs (a name not favorable to either company); the first release called XEmacs was version 19.11. In June 1994, Lucid folded and Jamie quit to work for the newly formed Mosaic Communications Corp., later Netscape Communications Corp. (co-founded by the same Marc Andreessen, who had quit his Epoch job to work on a graphical browser for the World Wide Web). Chuck then became the primary maintainer of XEmacs, and put out versions 19.11 through 19.14 in conjunction with Ben. For 19.12 and 19.13, Chuck added the new redisplay and many other display improvements and Ben added MULE support (support for Asian and other languages) and redesigned most of the internal Lisp subsystems to better support the MULE work and the various other features being added to XEmacs. After 19.14 Chuck retired as primary maintainer and Steve Baur stepped in.

   Soon after 19.13 was released, work began in earnest on the MULE internationalization code and the source tree was divided into two development paths. The MULE version was initially called 19.20, but was soon renamed to 20.0. In 1996 Martin Buchholz of Sun Microsystems took over the care and feeding of it and worked on it in parallel with the 19.14 development that was occurring at the same time. After much work by Martin, it was decided to release 20.0 ahead of 19.15 in February 1997. The source tree remained divided until 20.2 when the version 19 source was finally retired at version 19.16.

   In 1997, Sun finally dropped all pretense of support for XEmacs and Martin Buchholz left the company in November. Since then (and, because Steve Baur was never paid to work on XEmacs, mostly for the previous year as well), XEmacs has existed solely on the contributions of volunteers from the Free Software Community. Starting from 1997, Hrvoje Niksic and Kyle Jones have figured prominently in XEmacs development.

   Many attempts have been made to merge XEmacs and GNU Emacs, but they have consistently failed.

   A more detailed history is contained in the XEmacs About page.


File: internals.info,  Node: XEmacs From the Outside,  Next: The Lisp Language,  Prev: A History of Emacs,  Up: Top

XEmacs From the Outside
***********************

   XEmacs appears to the outside world as an editor, but it is really a Lisp environment. At its heart is a Lisp interpreter; it also "happens" to contain many specialized object types (e.g. buffers, windows, frames, events) that are useful for implementing an editor. Some of these objects (in particular windows and frames) have displayable representations, and XEmacs provides a function `redisplay()' that ensures that the display of all such objects matches their internal state. Most of the time, a standard Lisp environment is in a "read-eval-print" loop--i.e. "read some Lisp code, execute it, and print the results".
XEmacs has a similar loop:

   * read an event
   * dispatch the event (i.e. "do it")
   * redisplay

   Reading an event is done using the Lisp function `next-event', which waits for something to happen (typically, the user presses a key or moves the mouse) and returns an event object describing this. Dispatching an event is done using the Lisp function `dispatch-event', which looks up the event in a keymap object (a particular kind of object that associates an event with a Lisp function) and calls that function. The function "does" what the user has requested by changing the state of particular frame objects, buffer objects, etc. Finally, `redisplay()' is called, which updates the display to reflect those changes just made. Thus is an "editor" born.

   Note that you do not have to use XEmacs as an editor; you could just as well make it do your taxes, compute pi, play bridge, etc. You'd just have to write functions to do those operations in Lisp.


File: internals.info,  Node: The Lisp Language,  Next: XEmacs From the Perspective of Building,  Prev: XEmacs From the Outside,  Up: Top

The Lisp Language
*****************

   Lisp is a general-purpose language that is higher-level than C and in many ways more powerful than C. Powerful dialects of Lisp such as Common Lisp are probably much better languages for writing very large applications than is C. (Unfortunately, for many non-technical reasons C and its successor C++ have become the dominant languages for application development. These languages are both inadequate for extremely large applications, which is evidenced by the fact that newer, larger programs are becoming ever harder to write and are requiring ever more programmers despite great improvements in C development environments; and by the fact that, although hardware speeds and reliability have been growing at an exponential rate, most software is still generally considered to be slow and buggy.)

   The new Java language holds promise as a better general-purpose development language than C. Java has many features in common with Lisp that are not shared by C (this is not a coincidence, since Java was designed by James Gosling, a former Lisp hacker). This will be discussed more later.

   For those used to C, here is a summary of the basic differences between C and Lisp:

  1. Lisp has an extremely regular syntax. Every function, expression, and control statement is written in the form

             (FUNC ARG1 ARG2 ...)

     This is as opposed to C, which writes functions as

             func(ARG1, ARG2, ...)

     but writes expressions involving operators as (e.g.)

             ARG1 + ARG2

     and writes control statements as (e.g.)

             while (EXPR) { STATEMENT1; STATEMENT2; ... }

     Lisp equivalents of the latter two would be

             (+ ARG1 ARG2 ...)

     and

             (while EXPR STATEMENT1 STATEMENT2 ...)

  2. Lisp is a safe language. Assuming there are no bugs in the Lisp interpreter/compiler, it is impossible to write a program that "core dumps" or otherwise causes the machine to execute an illegal instruction. This is very different from C, where perhaps the most common outcome of a bug is exactly such a crash. A corollary of this is that the C operation of casting a pointer is impossible (and unnecessary) in Lisp, and that it is impossible to access memory outside the bounds of an array.

  3. Programs and data are written in the same form. The parenthesis-enclosing form described above for statements is the same form used for the most common data type in Lisp, the list. Thus, it is possible to represent any Lisp program using Lisp data types, and for one program to construct Lisp statements and then dynamically "evaluate" them, or cause them to execute.

  4. All objects are "dynamically typed". This means that part of every object is an indication of what type it is. A Lisp program can manipulate an object without knowing what type it is, and can query an object to determine its type.
     This means that, correspondingly, variables and function parameters can hold objects of any type and are not normally declared as being of any particular type. This is opposed to the "static typing" of C, where variables can hold exactly one type of object and must be declared as such, and objects do not contain an indication of their type because it's implicit in the variables they are stored in. It is possible in C to have a variable hold different types of objects (e.g. through the use of `void *' pointers or variable-argument functions), but the type information must then be passed explicitly in some other fashion, leading to additional program complexity.

  5. Allocated memory is automatically reclaimed when it is no longer in use. This operation is called "garbage collection" and involves looking through all variables to see what memory is being pointed to, and reclaiming any memory that is not pointed to and is thus "inaccessible" and out of use. This is as opposed to C, in which allocated memory must be explicitly reclaimed using `free()'. If you simply drop all pointers to memory without freeing it, it becomes "leaked" memory that still takes up space. Over a long period of time, this can cause your program to grow and grow until it runs out of memory.

  6. Lisp has built-in facilities for handling errors and exceptions. In C, when an error occurs, usually either the program exits entirely or the routine in which the error occurs returns a value indicating this. If an error occurs in a deeply-nested routine, then every routine currently called must unwind itself normally and return an error value back up to the next routine. This means that every routine must explicitly check for an error in all the routines it calls; if it does not do so, unexpected and often random behavior results. This is an extremely common source of bugs in C programs.
     An alternative would be to do a non-local exit using `longjmp()', but that is often very dangerous because the routines that were exited past had no opportunity to clean up after themselves and may leave things in an inconsistent state, causing a crash shortly afterwards.

     Lisp provides mechanisms to make such non-local exits safe. When an error occurs, a routine simply signals that an error of a particular class has occurred, and a non-local exit takes place. Any routine can trap errors occurring in routines it calls by registering an error handler for some or all classes of errors. (If no handler is registered, a default handler, generally installed by the top-level event loop, is executed; this prints out the error and continues.) Routines can also specify cleanup code (called an "unwind-protect") that will be called when control exits from a block of code, no matter how that exit occurs--i.e. even if a function deeply nested below it causes a non-local exit back to the top level.

     Note that this facility has appeared in some recent vintages of C, in particular Visual C++ and other PC compilers written for the Microsoft Win32 API.

  7. In Emacs Lisp, local variables are "dynamically scoped". This means that if you declare a local variable in a particular function, and then call another function, that subfunction can "see" the local variable you declared. This is actually considered a bug in Emacs Lisp and in all other early dialects of Lisp, and was corrected in Common Lisp. (In Common Lisp, you can still declare dynamically scoped variables if you want to--they are sometimes useful--but variables by default are "lexically scoped" as in C.)

   For those familiar with Lisp, Emacs Lisp is modelled after MacLisp, an early dialect of Lisp developed at MIT (no relation to the Macintosh computer). There is a Common Lisp compatibility package available for Emacs that provides many of the features of Common Lisp.
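
   Several of the differences listed above (items 3, 4, 6 and 7) can be tried out directly in Emacs Lisp. The following is only a minimal sketch, not code taken from XEmacs itself: every name beginning with `example-' is hypothetical and chosen for illustration, while `eval', `condition-case', `unwind-protect', `defvar' and `let' are standard Emacs Lisp forms.

```elisp
;; All names beginning with `example-' are hypothetical, chosen for
;; this sketch; they are not part of XEmacs.

;; Item 3: programs and data share one form.  Build a list as data,
;; then evaluate it as code.
(defun example-build-and-eval ()
  (let ((form (list '+ 1 2 3)))   ; the list (+ 1 2 3), built as data
    (eval form)))                 ; evaluated as code; returns 6

;; Item 4: dynamic typing.  Each object carries its type, which a
;; program can query at run time.
(defun example-describe (obj)
  (cond ((integerp obj) 'integer)
        ((stringp obj) 'string)
        ((consp obj) 'cons)
        (t 'other)))

;; Item 6: trap errors of one particular class with a handler.
(defun example-safe-divide (a b)
  (condition-case nil
      (/ a b)
    (arith-error 'division-failed)))   ; integer division by zero lands here

;; Item 6: cleanup code runs even when a non-local exit passes through.
(defvar example-cleaned-up nil)
(defun example-cleanup-demo ()
  (setq example-cleaned-up nil)
  (condition-case nil
      (unwind-protect
          (error "Something went wrong")   ; signals; a non-local exit follows
        (setq example-cleaned-up t))       ; cleanup form, run on the way out
    (error nil))                           ; swallow the error at this level
  example-cleaned-up)

;; Item 7: dynamic scoping.  The `defvar' is an addition for
;; portability: it marks the variable as dynamically bound even in
;; Emacs versions that also offer lexical scoping.
(defvar example-depth 0)
(defun example-read-depth () example-depth)
(defun example-scope-demo ()
  (let ((example-depth 5))
    (example-read-depth)))   ; the callee sees the caller's binding, 5
```

   Evaluating `(example-scope-demo)' returns 5 rather than 0, because `example-read-depth' sees the binding established by its caller; outside the `let', `(example-read-depth)' returns 0 again.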

   The Java language is derived in many ways from C, and shares a similar syntax, but has the following features in common with Lisp (and different from C):

  1. Java is a safe language, like Lisp.

  2. Java provides garbage collection, like Lisp.

  3. Java has built-in facilities for handling errors and exceptions, like Lisp.

  4. Java has a type system that combines the best advantages of both static and dynamic typing. Objects (except very simple types) are explicitly marked with their type, as in dynamic typing; but there is a hierarchy of types and functions are declared to accept only certain types, thus providing the increased compile-time error-checking of static typing.

   The Java language also has some negative attributes:

  1. Java uses the edit/compile/run model of software development. This makes it hard to use interactively. For example, to use Java like `bc' it is necessary to write a special purpose, albeit tiny, application. In Emacs Lisp, a calculator comes built-in without any effort - one can always just type an expression in the `*scratch*' buffer.

  2. Java tries too hard to enforce, not merely enable, portability, making ordinary access to standard OS facilities painful. Java has an "agenda". I think this is why `chdir' is not part of standard Java, which is inexcusable.

   Unfortunately, there is no perfect language. Static typing allows a compiler to catch programmer errors and produce more efficient code, but makes programming more tedious and less fun. For the foreseeable future, an Ideal Editing and Programming Environment (and that is what XEmacs aspires to) will be programmable in multiple languages: high level ones like Lisp for user customization and prototyping, and lower level ones for infrastructure and industrial strength applications. If I had my way, XEmacs would be friendly towards the Python, Scheme, C++, ML, etc... communities. But there are serious technical difficulties to achieving that goal.

   The word "application" in the previous paragraph was used intentionally. XEmacs implements an API for programs written in Lisp that makes it a full-fledged application platform, very much like an OS inside the real OS.


File: internals.info,  Node: XEmacs From the Perspective of Building,  Next: XEmacs From the Inside,  Prev: The Lisp Language,  Up: Top

XEmacs From the Perspective of Building
***************************************

   The heart of XEmacs is the Lisp environment, which is written in C. This is contained in the `src/' subdirectory. Underneath `src/' are two subdirectories of header files: `s/' (header files for particular operating systems) and `m/' (header files for particular machine types). In practice the distinction between the two types of header files is blurred. These header files define or undefine certain preprocessor constants and macros to indicate particular characteristics of the associated machine or operating system. As part of the configure process, one `s/' file and one `m/' file are identified for the particular environment in which XEmacs is being built.

   XEmacs also contains a great deal of Lisp code. This implements the operations that make XEmacs useful as an editor as well as just a Lisp environment, and also contains many add-on packages that allow XEmacs to browse directories, act as a mail and Usenet news reader, compile Lisp code, etc. There is actually more Lisp code than C code associated with XEmacs, but much of the Lisp code is peripheral to the actual operation of the editor. The Lisp code all lies in subdirectories underneath the `lisp/' directory.

   The `lwlib/' directory contains C code that implements a generalized interface onto different X widget toolkits and also implements some widgets of its own that behave like Motif widgets but are faster, free, and in some cases more powerful. The code in this directory compiles into a library and is mostly independent from XEmacs.
The `etc/' directory contains various data files associated with XEmacs. Some of them are actually read by XEmacs at startup; others merely contain useful information of various sorts.

The `lib-src/' directory contains C code for various auxiliary programs that are used in connection with XEmacs. Some of them are used during the build process; others are used to perform certain functions that cannot conveniently be placed in the XEmacs executable (e.g. the `movemail' program for fetching mail out of `/var/spool/mail', which must be setgid to `mail' on many systems; and the `gnuclient' program, which allows an external script to communicate with a running XEmacs process).

The `man/' directory contains the sources for the XEmacs documentation. It is mostly in a form called Texinfo, which can be converted into either a printed document (by passing it through TeX) or into on-line documentation called "info files".

The `info/' directory contains the results of formatting the XEmacs documentation as "info files", for on-line use. These files are used when you enter the Info system using `C-h i' or through the Help menu.

The `dynodump/' directory contains auxiliary code used to build XEmacs on Solaris platforms.

The other directories contain various miscellaneous code and information that is not normally used or needed.

The first step of building involves running the `configure' program and passing it various parameters to specify any optional features you want and compiler arguments and such, as described in the `INSTALL' file. This determines what the build environment is, chooses the appropriate `s/' and `m/' file, and runs a series of tests to determine many details about your environment, such as which library functions are available and exactly how they work.
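The outcome of such tests typically reaches the C sources as preprocessor constants that the code can check. The following is a minimal sketch of that pattern, not code taken from XEmacs: the constant `SKETCH_HAVE_MEMMOVE' and the function `sketch_copy_bytes' are hypothetical names, and the constant is defined by hand here where a real build would take it from a generated header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature constant.  In a real build a generated header
   would define it or leave it undefined; here it is set by hand. */
#define SKETCH_HAVE_MEMMOVE 1

/* Copy N bytes from SRC to DST; the two regions may overlap. */
void
sketch_copy_bytes (char *dst, const char *src, size_t n)
{
  assert (dst != NULL && src != NULL);
#ifdef SKETCH_HAVE_MEMMOVE
  /* The library routine is available and handles overlap itself. */
  memmove (dst, src, n);
#else
  /* Fallback for a platform without it: choose the copy direction
     by hand so that overlapping regions are not clobbered. */
  size_t i;
  if (dst < src)
    for (i = 0; i < n; i++)
      dst[i] = src[i];
  else
    for (i = n; i > 0; i--)
      dst[i - 1] = src[i - 1];
#endif
}
```

Removing the `#define' line selects the fallback branch; callers of `sketch_copy_bytes' do not change either way, which is the point of keeping such conditionals in one place.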
The reason for running these tests is that it allows XEmacs to be compiled on a much wider variety of platforms than those that the XEmacs developers happen to be familiar with, including various sorts of hybrid platforms. This is especially important now that many operating systems give you a great deal of control over exactly what features you want installed, and allow for easy upgrading of parts of a system without upgrading the rest. It would be impossible to pre-determine and pre-specify the information for all possible configurations.

In fact, the `s/' and `m/' files are basically _evil_, since they contain unmaintainable platform-specific hard-coded information. XEmacs has been moving in the direction of having all system-specific information be determined dynamically by `configure'. Perhaps someday we can `rm -rf src/s src/m'.

When `configure' is done running, it generates `Makefile's and `GNUmakefile's and the file `src/config.h' (which describes the features of your system) from template files. You then run `make', which compiles the auxiliary code and programs in `lib-src/' and `lwlib/' and the main XEmacs executable in `src/'. The result of compiling and linking is an executable called `temacs', which is _not_ the final XEmacs executable. `temacs' by itself is not intended to function as an editor or even display any windows on the screen, and if you simply run it, it will exit immediately. The `Makefile' runs `temacs' with certain options that cause it to initialize itself, read in a number of basic Lisp files, and then dump itself out into a new executable called `xemacs'. This new executable has been pre-initialized and contains pre-digested Lisp code that is necessary for the editor to function (this includes most basic editing functions, e.g.
`kill-line', that can be defined in terms of other Lisp primitives; some initialization code that is called when certain objects, such as frames, are created; and all of the standard keybindings and code for the actions they result in). This executable, `xemacs', is the executable that you run to use the XEmacs editor.

Although `temacs' is not intended to be run as an editor, it can be, by using the incantation `temacs -batch -l loadup.el run-temacs'. This is useful when the dumping procedure described above is broken, or when using certain program debugging tools such as Purify. These tools get mighty confused by the tricks played by the XEmacs build process, such as allocating memory in one process and freeing it in the next.


File: internals.info, Node: XEmacs From the Inside, Next: The XEmacs Object System (Abstractly Speaking), Prev: XEmacs From the Perspective of Building, Up: Top

XEmacs From the Inside
**********************

Internally, XEmacs is quite complex, and can be very confusing. To simplify things, it can be useful to think of XEmacs as containing an event loop that "drives" everything, and a number of other subsystems, such as a Lisp engine and a redisplay mechanism. Each of these other subsystems exists simultaneously in XEmacs, and each has a certain state. The flow of control continually passes in and out of these different subsystems in the course of normal operation of the editor.

It is important to keep in mind that, most of the time, the editor is "driven" by the event loop. Except during initialization and batch mode, all subsystems are entered directly or indirectly through the event loop, and ultimately, control exits out of all subsystems back up to the event loop. This cycle of entering a subsystem, exiting back out to the event loop, and starting another iteration of the event loop occurs once for each keystroke, mouse motion, etc.
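The cycle described above can be sketched in C as follows. This is a toy model under stated assumptions, not the actual XEmacs event loop: the event kinds, the `sketch_state' structure, and all function names are hypothetical, and a fixed array stands in for real input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; not the real XEmacs event types. */
enum sketch_event_kind { SKETCH_KEYSTROKE, SKETCH_MOUSE_MOTION, SKETCH_QUIT };

struct sketch_event
{
  enum sketch_event_kind kind;
  int data;                     /* character code or pointer position */
};

/* Each subsystem keeps its own state between events. */
struct sketch_state
{
  int chars_inserted;           /* stands in for buffer state */
  int pointer_x;                /* stands in for window/frame state */
  int redisplays;               /* stands in for redisplay bookkeeping */
};

static void
sketch_buffer_insert (struct sketch_state *s, int ch)
{
  (void) ch;                    /* the toy buffer only counts characters */
  s->chars_inserted++;
}

static void
sketch_redisplay (struct sketch_state *s)
{
  s->redisplays++;
}

/* One iteration per event: enter a subsystem, come back to the loop,
   then let redisplay catch up.  Returns the number of events
   dispatched before the queue ran out or a quit event was seen. */
int
sketch_event_loop (const struct sketch_event *queue, size_t n,
                   struct sketch_state *s)
{
  int dispatched = 0;
  size_t i;

  assert (queue != NULL && s != NULL);
  for (i = 0; i < n; i++)
    {
      const struct sketch_event *ev = &queue[i];

      if (ev->kind == SKETCH_QUIT)
        break;
      switch (ev->kind)
        {
        case SKETCH_KEYSTROKE:
          sketch_buffer_insert (s, ev->data);
          break;
        case SKETCH_MOUSE_MOTION:
          s->pointer_x = ev->data;
          break;
        default:
          break;
        }
      sketch_redisplay (s);     /* control is back in the loop here */
      dispatched++;
    }
  return dispatched;
}
```

Note that no subsystem calls the loop; the loop calls the subsystems, and each call returns before the next event is examined.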
If you're trying to understand a particular subsystem (other than the event loop), think of it as a "daemon" process or "servant" that is responsible for one particular aspect of a larger system, and periodically receives commands or environment changes that cause it to do something. Ultimately, these commands and environment changes are always triggered by the event loop. For example:

   * The window and frame mechanism is responsible for keeping track of what windows and frames exist, what buffers are in them, etc. It is periodically given commands (usually from the user) to make a change to the current window/frame state, e.g. create a new frame or delete a window.

   * The buffer mechanism is responsible for keeping track of what buffers exist and what text is in them. It is periodically given commands (usually from the user) to insert or delete text, create a buffer, etc. When it receives a text-change command, it notifies the redisplay mechanism.

   * The redisplay mechanism is responsible for making sure that windows and frames are displayed correctly. It is periodically told (by the event loop) to actually "do its job", i.e. snoop around and see what the current state of the environment (mostly of the currently-existing windows, frames, and buffers) is, and make sure that that state matches what's actually displayed. It keeps lots and lots of information around (such as what is actually being displayed currently, and what the environment was last time it checked) so that it can minimize the work it has to do. It is also helped along in that whenever a relevant change to the environment occurs, the redisplay mechanism is told about this, so it has a pretty good idea of where it has to look to find possible changes and doesn't have to look everywhere.

   * The Lisp engine is responsible for executing the Lisp code in which most user commands are written.
     It is entered through a call to `eval' or `funcall', which occurs as a result of dispatching an event from the event loop. The functions it calls issue commands to the buffer mechanism, the window/frame subsystem, etc.

   * The Lisp allocation subsystem is responsible for keeping track of Lisp objects. It is given commands from the Lisp engine to allocate objects, garbage collect, etc.

The important idea here is that there are a number of independent subsystems each with its own responsibility and persistent state, just like different employees in a company, and each subsystem is periodically given commands from other subsystems. Commands can flow from any one subsystem to any other, but there is usually some sort of hierarchy, with all commands originating from the event subsystem.

XEmacs is entered in `main()', which is in `emacs.c'. When this is called the first time (in a properly-invoked `temacs'), it does the following:

  1. It does some very basic environment initializations, such as determining where it and its directories (e.g. `lisp/' and `etc/') reside and setting up signal handlers.

  2. It initializes the entire Lisp interpreter.

  3. It sets the initial values of many built-in variables (including many variables that are visible to Lisp programs), such as the global keymap object and the built-in faces (a face is an object that describes the display characteristics of text). This involves creating Lisp objects and thus is dependent on step (2).

  4. It performs various other initializations that are relevant to the particular environment it is running in, such as retrieving environment variables, determining the current date and the user who is running the program, examining its standard input, creating any necessary file descriptors, etc.

  5. At this point, the C initialization is complete. A Lisp program that was specified on the command line (usually `loadup.el') is called (temacs is normally invoked as `temacs -batch -l loadup.el dump').
     `loadup.el' loads all of the other Lisp files that are needed for the operation of the editor, calls the `dump-emacs' function to write out `xemacs', and then kills the temacs process.

When `xemacs' is then run, it only redoes steps (1) and (4) above; all variables already contain the values they were set to when the executable was dumped, and all memory that was allocated with `malloc()' is still around. (XEmacs knows whether it is being run as `xemacs' or `temacs' because it sets the global variable `initialized' to 1 after step (4) above.) At this point, `xemacs' calls a Lisp function to do any further initialization, which includes parsing the command-line (the C code can only do limited command-line parsing, which includes looking for the `-batch' and `-l' flags and a few other flags that it needs to know about before initialization is complete), creating the first frame (or "window" in standard window-system parlance), running the user's init file (usually the file `.emacs' in the user's home directory), etc. The function to do this is usually called `normal-top-level'; `loadup.el' tells the C code about this function by setting its name as the value of the Lisp variable `top-level'.

When the Lisp initialization code is done, the C code enters the event loop, and stays there for the duration of the XEmacs process. The code for the event loop is contained in `cmdloop.c', and is called `Fcommand_loop_1()'. Note that this event loop could very well be written in Lisp, and in fact a Lisp version exists; but apparently, doing this makes XEmacs run noticeably slower.

Notice how much of the initialization is done in Lisp, not in C. In general, XEmacs tries to move as much code as possible into Lisp. Code that remains in C is code that implements the Lisp interpreter itself, or code that needs to be very fast, or code that needs to do system calls or other such stuff that needs to be done in C, or code that needs to have access to "forbidden" structures.
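The two-stage startup described above (a first run as `temacs', then a run of the dumped `xemacs' that redoes only steps (1) and (4)) can be modelled with a small C sketch. This is an illustration under stated assumptions, not the code in `emacs.c': every name is hypothetical, and a global variable that survives between two calls in the same process stands in for state preserved by dumping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the `initialized' flag: 0 on the first
   run, 1 afterwards.  In this toy model "dumping" is just the fact
   that the variable keeps its value between calls. */
int sketch_initialized = 0;

/* Counts how often each group of startup steps has run. */
struct sketch_trace
{
  int basic_env;      /* step 1: directories, signal handlers */
  int lisp_interp;    /* step 2: Lisp interpreter */
  int builtin_vars;   /* step 3: built-in variables */
  int runtime_env;    /* step 4: environment, user, descriptors */
};

void
sketch_startup (struct sketch_trace *t)
{
  assert (t != NULL);

  t->basic_env++;               /* step 1 runs on every start */

  if (!sketch_initialized)
    {
      /* Steps 2 and 3 run only before the dump; afterwards their
         results are already present in the executable image. */
      t->lisp_interp++;
      t->builtin_vars++;
    }

  t->runtime_env++;             /* step 4 runs on every start */

  /* Set after step 4, so the dumped image sees it as 1. */
  sketch_initialized = 1;
}
```

Calling `sketch_startup' twice on the same trace plays the role of running `temacs' and then `xemacs': the counters for steps (2) and (3) stay at one while those for steps (1) and (4) reach two.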
(One conscious aspect of the design of Lisp under XEmacs is a clean separation between the external interface to a Lisp object's functionality and its internal implementation. Part of this design is that Lisp programs are forbidden from accessing the contents of the object other than through a standard API. In this respect, XEmacs Lisp is similar to modern Lisp dialects but differs from GNU Emacs, which tends to expose the implementation and allow Lisp programs to look at it directly. The major advantage of hiding the implementation is that it allows the implementation to be redesigned without affecting any Lisp programs, including those that might want to be "clever" by looking directly at the object's contents and possibly manipulating them.)

Moving code into Lisp makes the code easier to debug and maintain and makes it much easier for people who are not XEmacs developers to customize XEmacs, because they can make a change with much less chance of obscure and unwanted interactions occurring than if they were to change the C code.
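The separation between an object's external interface and its internal implementation, discussed above for Lisp objects as seen from Lisp, has a familiar counterpart in C: the opaque type. The sketch below is only an analogy with hypothetical names (`sketch_marker' and its functions); it does not show how XEmacs represents any object.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Interface: all that callers are meant to rely on. ---- */

struct sketch_marker;           /* incomplete type, contents hidden */

struct sketch_marker *sketch_marker_new (int position);
int sketch_marker_position (const struct sketch_marker *m);
void sketch_marker_move (struct sketch_marker *m, int delta);
void sketch_marker_free (struct sketch_marker *m);

/* ---- Implementation: would normally live in its own .c file. ----
   The position is stored as base + offset.  That choice can be
   changed later without touching any caller, because callers never
   see the fields. */

struct sketch_marker
{
  int base;
  int offset;
};

struct sketch_marker *
sketch_marker_new (int position)
{
  struct sketch_marker *m = malloc (sizeof *m);

  if (m != NULL)
    {
      m->base = position;
      m->offset = 0;
    }
  return m;                     /* NULL if allocation failed */
}

int
sketch_marker_position (const struct sketch_marker *m)
{
  assert (m != NULL);
  return m->base + m->offset;
}

void
sketch_marker_move (struct sketch_marker *m, int delta)
{
  assert (m != NULL);
  m->offset += delta;
}

void
sketch_marker_free (struct sketch_marker *m)
{
  free (m);
}
```

A caller that uses only the four functions keeps working if the two fields are later replaced by a single integer, which is the kind of redesign that a hidden implementation permits.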