1 This is Info file ../../info/internals.info, produced by Makeinfo
2 version 1.68 from the input file internals.texi.
4 INFO-DIR-SECTION XEmacs Editor
6 * Internals: (internals). XEmacs Internals Manual.
9 Copyright (C) 1992 - 1996 Ben Wing. Copyright (C) 1996, 1997 Sun
10 Microsystems. Copyright (C) 1994 - 1998 Free Software Foundation.
11 Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
13 Permission is granted to make and distribute verbatim copies of this
14 manual provided the copyright notice and this permission notice are
15 preserved on all copies.
17 Permission is granted to copy and distribute modified versions of
18 this manual under the conditions for verbatim copying, provided that the
19 entire resulting derived work is distributed under the terms of a
20 permission notice identical to this one.
22 Permission is granted to copy and distribute translations of this
23 manual into another language, under the above conditions for modified
24 versions, except that this permission notice may be stated in a
25 translation approved by the Foundation.
27 Permission is granted to copy and distribute modified versions of
28 this manual under the conditions for verbatim copying, provided also
29 that the section entitled "GNU General Public License" is included
30 exactly as in the original, and provided that the entire resulting
31 derived work is distributed under the terms of a permission notice
32 identical to this one.
34 Permission is granted to copy and distribute translations of this
35 manual into another language, under the above conditions for modified
36 versions, except that the section entitled "GNU General Public License"
37 may be included in a translation approved by the Free Software
38 Foundation instead of in the original English.
41 File: internals.info, Node: Working With Character and Byte Positions, Next: Conversion to and from External Data, Prev: Character-Related Data Types, Up: Coding for Mule
43 Working With Character and Byte Positions
44 -----------------------------------------
46 Now that we have defined the basic character-related types, we can
47 look at the macros and functions designed for work with them and for
48 conversion between them. Most of these macros are defined in
49 `buffer.h', and we don't discuss all of them here, but only the most
important ones. Examining the existing code is the best way to learn
about them.

`MAX_EMCHAR_LEN'
This preprocessor constant is the maximum number of buffer bytes
55 per Emacs character, i.e. the byte length of an `Emchar'. It is
56 useful when allocating temporary strings to keep a known number of
57 characters. For instance:
63 /* Allocate place for CCLEN characters. */
64 Bufbyte *buf = (Bufbyte *)alloca (cclen * MAX_EMCHAR_LEN);
67 If you followed the previous section, you can guess that,
68 logically, multiplying a `Charcount' value with `MAX_EMCHAR_LEN'
69 produces a `Bytecount' value.
71 In the current Mule implementation, `MAX_EMCHAR_LEN' equals 4.
72 Without Mule, it is 1.
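The sizing rule above can be sketched in standalone C. The typedefs and the constant below are local stand-ins that merely mirror the names used in this section; they are assumptions for illustration, not the definitions from `buffer.h'. `malloc' is used in place of `alloca' so that the helper can return the buffer to its caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for illustration only; the real definitions live in the
   XEmacs headers and may differ.  */
typedef unsigned char Bufbyte;
typedef int Charcount;
typedef int Bytecount;
#define MAX_EMCHAR_LEN 4   /* value quoted above for Mule builds */

/* Worst-case number of bytes needed to hold CCLEN characters:
   a Charcount times MAX_EMCHAR_LEN yields a Bytecount.  */
static Bytecount
worst_case_bytes (Charcount cclen)
{
  assert (cclen >= 0);
  return cclen * MAX_EMCHAR_LEN;
}

/* Allocate room for CCLEN characters; the caller frees the result.  */
static Bufbyte *
alloc_for_chars (Charcount cclen)
{
  Bytecount size = worst_case_bytes (cclen);
  return (Bufbyte *) malloc (size > 0 ? (size_t) size : 1);
}
```

The buffer is usually larger than necessary, since most characters need fewer than `MAX_EMCHAR_LEN' bytes; the point is only that it can never be too small.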
`charptr_emchar'
`set_charptr_emchar'
The `charptr_emchar' macro takes a `Bufbyte' pointer and returns
the `Emchar' stored at that position. If it were a function, its
prototype would be:

Emchar charptr_emchar (Bufbyte *p);
82 `set_charptr_emchar' stores an `Emchar' to the specified byte
83 position. It returns the number of bytes stored:
85 Bytecount set_charptr_emchar (Bufbyte *p, Emchar c);
87 It is important to note that `set_charptr_emchar' is safe only for
88 appending a character at the end of a buffer, not for overwriting a
89 character in the middle. This is because the width of characters
90 varies, and `set_charptr_emchar' cannot resize the string if it
writes, say, a two-byte character where a single-byte character
used to reside.
94 A typical use of `set_charptr_emchar' can be demonstrated by this
example, which copies characters from buffer BUF to a temporary
string of Bufbytes, with `p' pointing into that string:

for (pos = beg; pos < end; pos++)
  {
    Emchar c = BUF_FETCH_CHAR (buf, pos);
    p += set_charptr_emchar (p, c);
  }
Note how `set_charptr_emchar' both stores the `Emchar' and, through
its return value, advances the pointer `p' in a single statement.
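The append idiom can be tried outside XEmacs with a toy encoder. The sketch below uses UTF-8 purely as a familiar variable-width encoding; it is not the Mule internal format, and `toy_set_char' and `toy_append_chars' are hypothetical names, not XEmacs functions.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Bufbyte;   /* stand-in types, for illustration */
typedef int Emchar;
typedef int Bytecount;
#define TOY_MAX_CHAR_LEN 4

/* Store C at P using UTF-8 and return the number of bytes written.  */
static Bytecount
toy_set_char (Bufbyte *p, Emchar c)
{
  assert (c >= 0 && c <= 0x10FFFF);
  if (c < 0x80)
    {
      p[0] = (Bufbyte) c;
      return 1;
    }
  if (c < 0x800)
    {
      p[0] = (Bufbyte) (0xC0 | (c >> 6));
      p[1] = (Bufbyte) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      p[0] = (Bufbyte) (0xE0 | (c >> 12));
      p[1] = (Bufbyte) (0x80 | ((c >> 6) & 0x3F));
      p[2] = (Bufbyte) (0x80 | (c & 0x3F));
      return 3;
    }
  p[0] = (Bufbyte) (0xF0 | (c >> 18));
  p[1] = (Bufbyte) (0x80 | ((c >> 12) & 0x3F));
  p[2] = (Bufbyte) (0x80 | ((c >> 6) & 0x3F));
  p[3] = (Bufbyte) (0x80 | (c & 0x3F));
  return 4;
}

/* Append N characters from SRC to DST, which must have room for
   N * TOY_MAX_CHAR_LEN bytes.  Return the number of bytes written.  */
static Bytecount
toy_append_chars (Bufbyte *dst, const Emchar *src, int n)
{
  Bufbyte *p = dst;
  int i;
  for (i = 0; i < n; i++)
    p += toy_set_char (p, src[i]);   /* store and advance in one step */
  return (Bytecount) (p - dst);
}
```

Because the destination is sized for the worst case and characters are only ever appended, no character can overwrite part of a neighbor, which is exactly the restriction described above.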
`INC_CHARPTR'
`DEC_CHARPTR'
These two macros increment and decrement a `Bufbyte' pointer,
113 respectively. They will adjust the pointer by the appropriate
114 number of bytes according to the byte length of the character
115 stored there. Both macros assume that the memory address is
116 located at the beginning of a valid character.
118 Without Mule support, `INC_CHARPTR (p)' and `DEC_CHARPTR (p)'
119 simply expand to `p++' and `p--', respectively.
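To see why both macros need the pointer to sit at a character boundary, consider a toy version over UTF-8 text. UTF-8 is used here only as a stand-in for a variable-width encoding in which non-leading bytes are recognizable; the function names are hypothetical and the real macros may be implemented quite differently.

```c
#include <assert.h>

typedef unsigned char Bufbyte;   /* stand-in type, for illustration */

/* In UTF-8, continuation bytes have the bit pattern 10xxxxxx.  */
static int
toy_is_continuation (Bufbyte b)
{
  return (b & 0xC0) == 0x80;
}

/* Advance P, which must point at the start of a character, to the
   start of the next one.  The text must be followed by a byte that
   is not a continuation byte, such as a terminating zero.  */
static const Bufbyte *
toy_inc_charptr (const Bufbyte *p)
{
  assert (!toy_is_continuation (*p));
  p++;
  while (toy_is_continuation (*p))
    p++;
  return p;
}

/* Move P back to the start of the previous character.  There must
   be a previous character.  */
static const Bufbyte *
toy_dec_charptr (const Bufbyte *p)
{
  p--;
  while (toy_is_continuation (*p))
    p--;
  return p;
}
```

With single-byte text neither loop ever runs, and the two functions reduce to `p + 1' and `p - 1', matching the non-Mule expansion mentioned above.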
121 `bytecount_to_charcount'
122 Given a pointer to a text string and a length in bytes, return the
123 equivalent length in characters.
125 Charcount bytecount_to_charcount (Bufbyte *p, Bytecount bc);
127 `charcount_to_bytecount'
128 Given a pointer to a text string and a length in characters,
129 return the equivalent length in bytes.
131 Bytecount charcount_to_bytecount (Bufbyte *p, Charcount cc);
`charptr_n_addr'
Return a pointer to the beginning of the character offset CC (in
characters) from P.

Bufbyte *charptr_n_addr (Bufbyte *p, Charcount cc);
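The three conversions are closely related, as a toy implementation shows. The sketch below again assumes UTF-8 as a stand-in encoding and uses hypothetical `toy_' names; it illustrates the relationship between the functions, not how XEmacs implements them.

```c
#include <assert.h>

typedef unsigned char Bufbyte;   /* stand-in types, for illustration */
typedef int Charcount;
typedef int Bytecount;

/* Byte length of the UTF-8 character whose first byte is B.  */
static Bytecount
toy_char_len (Bufbyte b)
{
  if (b < 0x80)
    return 1;
  if ((b & 0xE0) == 0xC0)
    return 2;
  if ((b & 0xF0) == 0xE0)
    return 3;
  assert ((b & 0xF8) == 0xF0);
  return 4;
}

/* Number of characters in the BC bytes starting at P.  */
static Charcount
toy_bytecount_to_charcount (const Bufbyte *p, Bytecount bc)
{
  const Bufbyte *end = p + bc;
  Charcount cc = 0;
  while (p < end)
    {
      p += toy_char_len (*p);
      cc++;
    }
  return cc;
}

/* Number of bytes occupied by the first CC characters at P.  */
static Bytecount
toy_charcount_to_bytecount (const Bufbyte *p, Charcount cc)
{
  const Bufbyte *start = p;
  while (cc-- > 0)
    p += toy_char_len (*p);
  return (Bytecount) (p - start);
}

/* Address of the character CC characters after P.  */
static const Bufbyte *
toy_charptr_n_addr (const Bufbyte *p, Charcount cc)
{
  return p + toy_charcount_to_bytecount (p, cc);
}
```

Each of these walks the text one character at a time, so the cost grows with the length of the text; this is one reason to keep track of which kind of count a variable holds rather than converting repeatedly.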
140 File: internals.info, Node: Conversion to and from External Data, Next: General Guidelines for Writing Mule-Aware Code, Prev: Working With Character and Byte Positions, Up: Coding for Mule
142 Conversion to and from External Data
143 ------------------------------------
145 When an external function, such as a C library function, returns a
146 `char' pointer, you should almost never treat it as `Bufbyte'. This is
because these returned strings may contain 8-bit characters which can be
148 misinterpreted by XEmacs, and cause a crash. Likewise, when exporting
149 a piece of internal text to the outside world, you should always
150 convert it to an appropriate external encoding, lest the internal stuff
151 (such as the infamous \201 characters) leak out.
The interface for conversion between the internal and external
representations of text consists of the numerous conversion macros defined in
155 `buffer.h'. Before looking at them, we'll look at the external formats
156 supported by these macros.
158 Currently meaningful formats are `FORMAT_BINARY', `FORMAT_FILENAME',
159 `FORMAT_OS', and `FORMAT_CTEXT'. Here is a description of these.
`FORMAT_BINARY'
Binary format. This is the simplest format and is what we use in
the absence of a more appropriate format. This converts according
to the `binary' coding system:

a. On input, bytes 0-255 are converted into characters 0-255.

b. On output, characters 0-255 are converted into bytes 0-255
and other characters are converted into `X'.

`FORMAT_FILENAME'
Format used for filenames. In the original Mule, this is
user-definable with the `pathname-coding-system' variable. For
the moment, we just use the `binary' coding system.

`FORMAT_OS'
Format used for the external Unix environment--`argv[]', stuff
from `getenv()', stuff from the `/etc/passwd' file, etc.

Perhaps should be the same as FORMAT_FILENAME.

`FORMAT_CTEXT'
Compound-text format. This is the standard X format used for data
stored in properties, selections, and the like. This is an 8-bit
no-lock-shift ISO2022 coding system.
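The two rules given for the binary format are simple enough to write down directly. The following is a minimal sketch, assuming a plain `int' as the character type and an array of characters as the "internal" side; the function names are hypothetical and this is not the XEmacs coding-system code.

```c
#include <assert.h>
#include <stddef.h>

typedef int Emchar;   /* stand-in character type, for illustration */

/* Input direction: each byte 0-255 becomes the character with the
   same code.  DST must have room for LEN characters.  */
static void
toy_binary_decode (const unsigned char *src, size_t len, Emchar *dst)
{
  size_t i;
  assert (src != NULL && dst != NULL);
  for (i = 0; i < len; i++)
    dst[i] = src[i];
}

/* Output direction: characters 0-255 map to the byte with the same
   value; any other character becomes 'X'.  DST must have room for
   LEN bytes.  */
static void
toy_binary_encode (const Emchar *src, size_t len, unsigned char *dst)
{
  size_t i;
  assert (src != NULL && dst != NULL);
  for (i = 0; i < len; i++)
    dst[i] = (src[i] >= 0 && src[i] <= 255) ? (unsigned char) src[i] : 'X';
}
```

Decoding never loses information, while encoding does whenever a character falls outside 0-255; this asymmetry is why the binary format is only a fallback.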
187 The macros to convert between these formats and the internal format,
188 and vice versa, follow.
190 `GET_CHARPTR_INT_DATA_ALLOCA'
191 `GET_CHARPTR_EXT_DATA_ALLOCA'
192 These two are the most basic conversion macros.
193 `GET_CHARPTR_INT_DATA_ALLOCA' converts external data to internal
194 format, and `GET_CHARPTR_EXT_DATA_ALLOCA' converts the other way
195 around. The arguments each of these receives are PTR (pointer to
the text in external format), LEN (length of the text in bytes), FMT
197 (format of the external text), PTR_OUT (lvalue to which new text
198 should be copied), and LEN_OUT (lvalue which will be assigned the
199 length of the internal text in bytes). The resulting text is
200 stored to a stack-allocated buffer. If the text doesn't need
201 changing, these macros will do nothing, except for setting LEN_OUT.
The macros above take many arguments, which makes them unwieldy.
For this reason, a number of convenience macros are defined with
obvious functionality, but accepting fewer arguments. The general
206 rule is that macros with `INT' in their name convert text to
207 internal Emacs representation, whereas the `EXT' macros convert to
208 external representation.
210 `GET_C_CHARPTR_INT_DATA_ALLOCA'
211 `GET_C_CHARPTR_EXT_DATA_ALLOCA'
212 As their names imply, these macros work on C char pointers, which
are zero-terminated, and thus do not need LEN or LEN_OUT
parameters.
216 `GET_STRING_EXT_DATA_ALLOCA'
217 `GET_C_STRING_EXT_DATA_ALLOCA'
218 These two macros convert a Lisp string into an external
219 representation. The difference between them is that
220 `GET_STRING_EXT_DATA_ALLOCA' stores its output to a generic
221 string, providing LEN_OUT, the length of the resulting external
222 string. On the other hand, `GET_C_STRING_EXT_DATA_ALLOCA' assumes
that the caller will be satisfied with the output string being
zero-terminated.

Note that for Lisp strings only one conversion direction makes
sense.
229 `GET_C_CHARPTR_EXT_BINARY_DATA_ALLOCA'
230 `GET_CHARPTR_EXT_BINARY_DATA_ALLOCA'
231 `GET_STRING_BINARY_DATA_ALLOCA'
232 `GET_C_STRING_BINARY_DATA_ALLOCA'
233 `GET_C_CHARPTR_EXT_FILENAME_DATA_ALLOCA'
235 These macros convert internal text to a specific external
236 representation, with the external format being encoded into the
237 name of the macro. Note that the `GET_STRING_...' and
238 `GET_C_STRING...' macros lack the `EXT' tag, because they only
239 make sense in that direction.
241 `GET_C_CHARPTR_INT_BINARY_DATA_ALLOCA'
242 `GET_CHARPTR_INT_BINARY_DATA_ALLOCA'
243 `GET_C_CHARPTR_INT_FILENAME_DATA_ALLOCA'
245 These macros convert external text of a specific format to its
internal representation, with the external format being encoded
247 into the name of the macro.
250 File: internals.info, Node: General Guidelines for Writing Mule-Aware Code, Next: An Example of Mule-Aware Code, Prev: Conversion to and from External Data, Up: Coding for Mule
252 General Guidelines for Writing Mule-Aware Code
253 ----------------------------------------------
255 This section contains some general guidance on how to write
256 Mule-aware code, as well as some pitfalls you should avoid.
258 *Never use `char' and `char *'.*
259 In XEmacs, the use of `char' and `char *' is almost always a
260 mistake. If you want to manipulate an Emacs character from "C",
261 use `Emchar'. If you want to examine a specific octet in the
262 internal format, use `Bufbyte'. If you want a Lisp-visible
263 character, use a `Lisp_Object' and `make_char'. If you want a
264 pointer to move through the internal text, use `Bufbyte *'. Also
265 note that you almost certainly do not need `Emchar *'.
267 *Be careful not to confuse `Charcount', `Bytecount', and `Bufpos'.*
268 The whole point of using different types is to avoid confusion
269 about the use of certain variables. Lest this effect be
270 nullified, you need to be careful about using the right types.
272 *Always convert external data*
273 It is extremely important to always convert external data, because
XEmacs can crash if unexpected 8-bit sequences are copied to its
275 internal buffers literally.
277 This means that when a system function, such as `readdir', returns
278 a string, you need to convert it using one of the conversion macros
described in the previous section, before passing it further to
280 Lisp. In the case of `readdir', you would use the
281 `GET_C_CHARPTR_INT_FILENAME_DATA_ALLOCA' macro.
283 Also note that many internal functions, such as `make_string',
284 accept Bufbytes, which removes the need for them to convert the
285 data they receive. This increases efficiency because that way
286 external data needs to be decoded only once, when it is read.
287 After that, it is passed around in internal format.
290 File: internals.info, Node: An Example of Mule-Aware Code, Prev: General Guidelines for Writing Mule-Aware Code, Up: Coding for Mule
292 An Example of Mule-Aware Code
293 -----------------------------
As an example of Mule-aware code, we will analyze the `string'
296 function, which conses up a Lisp string from the character arguments it
297 receives. Here is the definition, pasted from `alloc.c':
DEFUN ("string", Fstring, 0, MANY, 0, /*
Concatenate all the argument characters and make the result a string.
*/
       (int nargs, Lisp_Object *args))
{
  Bufbyte *storage = alloca_array (Bufbyte, nargs * MAX_EMCHAR_LEN);
  Bufbyte *p = storage;

  for (; nargs; nargs--, args++)
    {
      Lisp_Object lisp_char = *args;
      CHECK_CHAR_COERCE_INT (lisp_char);
      p += set_charptr_emchar (p, XCHAR (lisp_char));
    }
  return make_string (storage, p - storage);
}
316 Now we can analyze the source line by line.
Obviously, the string will contain as many characters as there are
arguments to the function. This is why we allocate `MAX_EMCHAR_LEN' * NARGS
bytes on the stack, i.e. the worst-case number of bytes for NARGS `Emchar's to
fit in the string.
323 Then, the loop checks that each element is a character, converting
324 integers in the process. Like many other functions in XEmacs, this
325 function silently accepts integers where characters are expected, for
326 historical and compatibility reasons. Unless you know what you are
327 doing, `CHECK_CHAR' will also suffice. `XCHAR (lisp_char)' extracts
328 the `Emchar' from the `Lisp_Object', and `set_charptr_emchar' stores it
329 to storage, increasing `p' in the process.
331 Other instructive examples of correct coding under Mule can be found
332 all over the XEmacs code. For starters, I recommend
333 `Fnormalize_menu_item_name' in `menubar.c'. After you have understood
this section of the manual and studied the examples, you can proceed
to writing new Mule-aware code.
338 File: internals.info, Node: Techniques for XEmacs Developers, Prev: Coding for Mule, Up: Rules When Writing New C Code
340 Techniques for XEmacs Developers
341 ================================
343 To make a quantified XEmacs, do: `make quantmacs'.
345 You simply can't dump Quantified and Purified images. Run the image
346 like so: `quantmacs -batch -l loadup.el run-temacs XEMACS-ARGS...'.
348 Before you go through the trouble, are you compiling with all
debugging and error-checking off? If not, try that first. Be warned
that while Quantify is directly responsible for quite a few
optimizations which have been made to XEmacs, doing a run which
generates results which can be acted upon is not necessarily a trivial
task.

Also, if you're still willing to do some runs, make sure you configure
356 with the `--quantify' flag. That will keep Quantify from starting to
357 record data until after the loadup is completed and will shut off
358 recording right before it shuts down (which generates enough bogus data
359 to throw most results off). It also enables three additional elisp
360 commands: `quantify-start-recording-data',
361 `quantify-stop-recording-data' and `quantify-clear-data'.
363 If you want to make XEmacs faster, target your favorite slow
364 benchmark, run a profiler like Quantify, `gprof', or `tcov', and figure
365 out where the cycles are going. Specific projects:
367 * Make the garbage collector faster. Figure out how to write an
368 incremental garbage collector.
370 * Write a compiler that takes bytecode and spits out C code.
371 Unfortunately, you will then need a C compiler and a more fully
372 developed module system.
374 * Speed up redisplay.
376 * Speed up syntax highlighting. Maybe moving some of the syntax
377 highlighting capabilities into C would make a difference.
379 * Implement tail recursion in Emacs Lisp (hard!).
381 Unfortunately, Emacs Lisp is slow, and is going to stay slow.
382 Function calls in elisp are especially expensive. Iterating over a
long list can be on the order of 30 times faster when implemented in C
than in Elisp.
385 To get started debugging XEmacs, take a look at the `gdbinit' and
386 `dbxrc' files in the `src' directory. *Note Q2.1.15 - How to Debug an
387 XEmacs problem with a debugger: (xemacs-faq)Q2.1.15 - How to Debug an
388 XEmacs problem with a debugger.
390 After making source code changes, run `make check' to ensure that
391 you haven't introduced any regressions. If you're feeling ambitious,
392 you can try to improve the test suite in `tests/automated'.
394 Here are things to know when you create a new source file:
396 * All `.c' files should `#include <config.h>' first. Almost all
397 `.c' files should `#include "lisp.h"' second.
* Generated header files should be included using the `#include
<...>' syntax, not the `#include "..."' syntax. The generated
headers are:

`config.h puresize-adjust.h sheap-adjust.h paths.h Emacs.ad.h'

The basic rule is that you should assume builds use `--srcdir',
so the `#include <...>' syntax needs to be used when the
to-be-included generated file is in a potentially different
directory *at compile time*. The non-obvious C rule is that
`#include "..."' means to search for the included file in the
same directory as the including file, *not* in the current
directory.

* Header files should *not* include `<config.h>' and `"lisp.h"'. It
is the responsibility of the `.c' files that use them to do so.
416 * If the header uses `INLINE', either directly or through
417 `DECLARE_LRECORD', then it must be added to `inline.c''s includes.
419 * Try compiling at least once with
421 gcc --with-mule --with-union-type --error-checking=all
423 * Did I mention that you should run the test suite?
427 File: internals.info, Node: A Summary of the Various XEmacs Modules, Next: Allocation of Objects in XEmacs Lisp, Prev: Rules When Writing New C Code, Up: Top
429 A Summary of the Various XEmacs Modules
430 ***************************************
432 This is accurate as of XEmacs 20.0.
* Menu:

* Low-Level Modules::
437 * Basic Lisp Modules::
438 * Modules for Standard Editing Operations::
439 * Editor-Level Control Flow Modules::
440 * Modules for the Basic Displayable Lisp Objects::
441 * Modules for other Display-Related Lisp Objects::
442 * Modules for the Redisplay Mechanism::
443 * Modules for Interfacing with the File System::
444 * Modules for Other Aspects of the Lisp Interpreter and Object System::
445 * Modules for Interfacing with the Operating System::
446 * Modules for Interfacing with X Windows::
447 * Modules for Internationalization::
450 File: internals.info, Node: Low-Level Modules, Next: Basic Lisp Modules, Up: A Summary of the Various XEmacs Modules
Low-Level Modules
=================

config.h

This is automatically generated from `config.h.in' based on the
458 results of configure tests and user-selected optional features and
459 contains preprocessor definitions specifying the nature of the
460 environment in which XEmacs is being compiled.
paths.h

This is automatically generated from `paths.h.in' based on supplied
465 configure values, and allows for non-standard installed configurations
466 of the XEmacs directories. It's currently broken, though.
471 `emacs.c' contains `main()' and other code that performs the most
472 basic environment initializations and handles shutting down the XEmacs
473 process (this includes `kill-emacs', the normal way that XEmacs is
474 exited; `dump-emacs', which is used during the build process to write
475 out the XEmacs executable; `run-emacs-from-temacs', which can be used
476 to start XEmacs directly when temacs has finished loading all the Lisp
477 code; and emergency code to handle crashes [XEmacs tries to auto-save
478 all files before it crashes]).
480 Low-level code that directly interacts with the Unix signal
481 mechanism, however, is in `signal.c'. Note that this code does not
482 handle system dependencies in interfacing to signals; that is handled
483 using the `syssignal.h' header file, described in section J below.
503 These modules contain code dumping out the XEmacs executable on
504 various different systems. (This process is highly machine-specific and
505 requires intimate knowledge of the executable format and the memory map
506 of the process.) Only one of these modules is actually used; this is
507 chosen by `configure'.
513 These modules are used in conjunction with the dump mechanism. On
514 some systems, an alternative version of the C startup code (the actual
515 code that receives control from the operating system when the process is
516 started, and which calls `main()') is required so that the dumping
517 process works properly; `crt0.c' provides this.
519 `pre-crt0.c' and `lastfile.c' should be the very first and very last
520 file linked, respectively. (Actually, this is not really true.
521 `lastfile.c' should be after all Emacs modules whose initialized data
522 should be made constant, and before all other Emacs files and all
523 libraries. In particular, the allocation modules `gmalloc.c',
524 `alloca.c', etc. are normally placed past `lastfile.c', and all of the
525 files that implement Xt widget classes *must* be placed after
526 `lastfile.c' because they contain various structures that must be
527 statically initialized and into which Xt writes at various times.)
528 `pre-crt0.c' and `lastfile.c' contain exported symbols that are used to
determine the start and end of XEmacs' initialized data space when
dumping.
541 These handle basic C allocation of memory. `alloca.c' is an
542 emulation of the stack allocation function `alloca()' on machines that
543 lack this. (XEmacs makes extensive use of `alloca()' in its code.)
545 `gmalloc.c' and `malloc.c' are two implementations of the standard C
546 functions `malloc()', `realloc()' and `free()'. They are often used in
547 place of the standard system-provided `malloc()' because they usually
548 provide a much faster implementation, at the expense of additional
549 memory use. `gmalloc.c' is a newer implementation that is much more
550 memory-efficient for large allocations than `malloc.c', and should
551 always be preferred if it works. (At one point, `gmalloc.c' didn't work
552 on some systems where `malloc.c' worked; but this should be fixed now.)
554 `ralloc.c' is the "relocating allocator". It provides functions
555 similar to `malloc()', `realloc()' and `free()' that allocate memory
556 that can be dynamically relocated in memory. The advantage of this is
557 that allocated memory can be shuffled around to place all the free
558 memory at the end of the heap, and the heap can then be shrunk,
559 releasing the memory back to the operating system. The use of this can
560 be controlled with the configure option `--rel-alloc'; if enabled,
561 memory allocated for buffers will be relocatable, so that if a very
562 large file is visited and the buffer is later killed, the memory can be
563 released to the operating system. (The disadvantage of this mechanism
564 is that it can be very slow. On systems with the `mmap()' system call,
565 the XEmacs version of `ralloc.c' uses this to move memory around
566 without actually having to block-copy it, which can speed things up;
567 but it can still cause noticeable performance degradation.)
569 `free-hook.c' contains some debugging functions for checking for
570 invalid arguments to `free()'.
572 `vm-limit.c' contains some functions that warn the user when memory
573 is getting low. These are callback functions that are called by
574 `gmalloc.c' and `malloc.c' at appropriate times.
576 `getpagesize.h' provides a uniform interface for retrieving the size
577 of a page in virtual memory. `mem-limits.h' provides a uniform
578 interface for retrieving the total amount of available virtual memory.
579 Both are similar in spirit to the `sys*.h' files described in section
586 These implement a couple of basic C data types to facilitate memory
587 allocation. The `Blocktype' type efficiently manages the allocation of
588 fixed-size blocks by minimizing the number of times that `malloc()' and
589 `free()' are called. It allocates memory in large chunks, subdivides
590 the chunks into blocks of the proper size, and returns the blocks as
591 requested. When blocks are freed, they are placed onto a linked list,
592 so they can be efficiently reused. This data type is not much used in
593 XEmacs currently, because it's a fairly new addition.
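The allocation strategy described for `Blocktype' can be sketched as a small free-list allocator. Everything below is a hypothetical illustration written for this section: the names, the chunk size of 64 blocks, and the layout are assumptions and do not reflect the actual `blocktype.c' interface.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_BLOCKS_PER_CHUNK 64

/* A free block stores the link to the next free block in its own
   memory, so the free list needs no extra storage.  */
struct toy_free_block
{
  struct toy_free_block *next;
};

struct toy_blocktype
{
  size_t block_size;                 /* size of each block in bytes */
  struct toy_free_block *free_list;  /* blocks available for reuse */
  int chunks_allocated;              /* number of malloc() calls so far */
};

static void
toy_blocktype_init (struct toy_blocktype *bt, size_t block_size)
{
  /* Each block must be able to hold a free-list link; sizes are
     rounded up so every block in a chunk stays pointer-aligned.  */
  size_t align = sizeof (struct toy_free_block);
  if (block_size < align)
    block_size = align;
  bt->block_size = (block_size + align - 1) / align * align;
  bt->free_list = NULL;
  bt->chunks_allocated = 0;
}

/* Return BLOCK to the free list so that it can be reused.  */
static void
toy_blocktype_free (struct toy_blocktype *bt, void *block)
{
  struct toy_free_block *fb = (struct toy_free_block *) block;
  assert (block != NULL);
  fb->next = bt->free_list;
  bt->free_list = fb;
}

/* Hand out one block, calling malloc() only when the free list is
   empty.  Return NULL if memory is exhausted.  */
static void *
toy_blocktype_alloc (struct toy_blocktype *bt)
{
  struct toy_free_block *fb;
  if (bt->free_list == NULL)
    {
      /* Out of blocks: get one large chunk and subdivide it.  */
      char *chunk = (char *) malloc (bt->block_size * TOY_BLOCKS_PER_CHUNK);
      int i;
      if (chunk == NULL)
        return NULL;
      bt->chunks_allocated++;
      for (i = 0; i < TOY_BLOCKS_PER_CHUNK; i++)
        toy_blocktype_free (bt, chunk + i * bt->block_size);
    }
  fb = bt->free_list;
  bt->free_list = fb->next;
  return fb;
}
```

In this sketch chunks are never returned to the system, and blocks are aligned only for pointer-sized data; a production allocator would have to address both points.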
595 The `Dynarr' type implements a "dynamic array", which is similar to
596 a standard C array but has no fixed limit on the number of elements it
597 can contain. Dynamic arrays can hold elements of any type, and when
598 you add a new element, the array automatically resizes itself if it
isn't big enough. Dynarrs are extensively used in the redisplay
mechanism.
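A dynamic array of this kind can be sketched in a few lines of C. The structure and function names below are hypothetical and chosen for this illustration; the real `Dynarr' interface is macro-based and differs in detail. The doubling growth policy is likewise an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_dynarr
{
  char *data;      /* element storage */
  size_t elsize;   /* size of one element in bytes */
  size_t len;      /* number of elements in use */
  size_t cap;      /* number of elements allocated */
};

static void
toy_dynarr_init (struct toy_dynarr *d, size_t elsize)
{
  assert (elsize > 0);
  d->data = NULL;
  d->elsize = elsize;
  d->len = 0;
  d->cap = 0;
}

/* Append a copy of *EL, growing the storage if it is full.
   Return 0 on success, -1 if memory is exhausted.  */
static int
toy_dynarr_add (struct toy_dynarr *d, const void *el)
{
  if (d->len == d->cap)
    {
      size_t new_cap = d->cap ? d->cap * 2 : 8;
      char *new_data = (char *) realloc (d->data, new_cap * d->elsize);
      if (new_data == NULL)
        return -1;
      d->data = new_data;
      d->cap = new_cap;
    }
  memcpy (d->data + d->len * d->elsize, el, d->elsize);
  d->len++;
  return 0;
}

/* Address of element I, which must be in range.  */
static void *
toy_dynarr_at (const struct toy_dynarr *d, size_t i)
{
  assert (i < d->len);
  return d->data + i * d->elsize;
}

static void
toy_dynarr_free (struct toy_dynarr *d)
{
  free (d->data);
  d->data = NULL;
  d->len = 0;
  d->cap = 0;
}
```

Note that growing the array may move its storage, so pointers obtained from `toy_dynarr_at' must not be kept across a call to `toy_dynarr_add'.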
604 This module is used in connection with inline functions (available in
605 some compilers). Often, inline functions need to have a corresponding
606 non-inline function that does the same thing. This module is where they
607 reside. It contains no actual code, but defines some special flags that
608 cause inline functions defined in header files to be rendered as actual
609 functions. It then includes all header files that contain any inline
610 function definitions, so that each one gets a real function equivalent.
615 These functions provide a system for doing internal consistency
616 checks during code development. This system is not currently used;
617 instead the simpler `assert()' macro is used along with the various
618 checks provided by the `--error-check-*' configuration options.
622 This is actually the source for a small, self-contained program used
627 This is not currently used.
630 File: internals.info, Node: Basic Lisp Modules, Next: Modules for Standard Editing Operations, Prev: Low-Level Modules, Up: A Summary of the Various XEmacs Modules
642 These are the basic header files for all XEmacs modules. Each module
643 includes `lisp.h', which brings the other header files in. `lisp.h'
644 contains the definitions of the structures and extractor and
645 constructor macros for the basic Lisp objects and various other basic
646 definitions for the Lisp environment, as well as some general-purpose
647 definitions (e.g. `min()' and `max()'). `lisp.h' includes either
648 `lisp-disunion.h' or `lisp-union.h', depending on whether
649 `USE_UNION_TYPE' is defined. These files define the typedef of the
650 Lisp object itself (as described above) and the low-level macros that
651 hide the actual implementation of the Lisp object. All extractor and
652 constructor macros for particular types of Lisp objects are defined in
653 terms of these low-level macros.
655 As a general rule, all typedefs should go into the typedefs section
656 of `lisp.h' rather than into a module-specific header file even if the
657 structure is defined elsewhere. This allows function prototypes that
658 use the typedef to be placed into other header files. Forward structure
659 declarations (i.e. a simple declaration like `struct foo;' where the
660 structure itself is defined elsewhere) should be placed into the
661 typedefs section as necessary.
663 `lrecord.h' contains the basic structures and macros that implement
664 all record-type Lisp objects - i.e. all objects whose type is a field
in their C structure, which includes all objects except the few most
primitive ones.
668 `lisp.h' contains prototypes for most of the exported functions in
669 the various modules. Lisp primitives defined using `DEFUN' that need
670 to be called by C code should be declared using `EXFUN'. Other
671 function prototypes should be placed either into the appropriate
672 section of `lisp.h', or into a module-specific header file, depending
673 on how general-purpose the function is and whether it has
special-purpose argument types requiring definitions not in `lisp.h'.
675 All initialization functions are prototyped in `symsinit.h'.
681 The large module `alloc.c' implements all of the basic allocation and
682 garbage collection for Lisp objects. The most commonly used Lisp
683 objects are allocated in chunks, similar to the Blocktype data type
684 described above; others are allocated in individually `malloc()'ed
685 blocks. This module provides the foundation on which all other aspects
of the Lisp environment sit, and is the first module initialized at
startup.
689 Note that `alloc.c' provides a series of generic functions that are
690 not dependent on any particular object type, and interfaces to
691 particular types of objects using a standardized interface of
692 type-specific methods. This scheme is a fundamental principle of
693 object-oriented programming and is heavily used throughout XEmacs. The
694 great advantage of this is that it allows for a clean separation of
695 functionality into different modules - new classes of Lisp objects, new
696 event interfaces, new device types, new stream interfaces, etc. can be
697 added transparently without affecting code anywhere else in XEmacs.
698 Because the different subsystems are divided into general and specific
699 code, adding a new subtype within a subsystem will in general not
700 require changes to the generic subsystem code or affect any of the other
701 subtypes in the subsystem; this provides a great deal of robustness to
704 `pure.c' contains the declaration of the "purespace" array. Pure
705 space is a hack used to place some constant Lisp data into the code
706 segment of the XEmacs executable, even though the data needs to be
707 initialized through function calls. (See above in section VIII for more
info about this.) During startup, certain sorts of data are
709 automatically copied into pure space, and other data is copied manually
710 in some of the basic Lisp files by calling the function `purecopy',
711 which copies the object if possible (this only works in temacs, of
712 course) and returns the new object. In particular, while temacs is
713 executing, the Lisp reader automatically copies all compiled-function
714 objects that it reads into pure space. Since compiled-function objects
715 are large, are never modified, and typically comprise the majority of
716 the contents of a compiled-Lisp file, this works well. While XEmacs is
717 running, any attempt to modify an object that resides in pure space
718 causes an error. Objects in pure space are never garbage collected -
719 almost all of the time, they're intended to be permanent, and in any
720 case you can't write into pure space to set the mark bits.
722 `puresize.h' contains the declaration of the size of the pure space
723 array. This depends on the optional features that are compiled in, any
724 extra purespace requested by the user at compile time, and certain other
725 factors (e.g. 64-bit machines need more pure space because their Lisp
726 objects are larger). The smallest size that suffices should be used, so
727 that there's no wasted space. If there's not enough pure space, you
728 will get an error during the build process, specifying how much more
729 pure space is needed.
734 This module contains all of the functions to handle the flow of
735 control. This includes the mechanisms of defining functions, calling
736 functions, traversing stack frames, and binding variables; the control
737 primitives and other special forms such as `while', `if', `eval',
738 `let', `and', `or', `progn', etc.; handling of non-local exits,
739 unwind-protects, and exception handlers; entering the debugger; methods
740 for the subr Lisp object type; etc. It does *not* include the `read'
741 function, the `print' function, or the handling of symbols and obarrays.
`backtrace.h' contains some structures related to stack frames and
the flow of control.
748 This module implements the Lisp reader and the `read' function,
749 which converts text into Lisp objects, according to the read syntax of
750 the objects, as described above. This is similar to the parser that is
751 a part of all compilers.
755 This module implements the Lisp print mechanism and the `print'
756 function and related functions. This is the inverse of the Lisp reader
757 - it converts Lisp objects to a printed, textual representation.
(Hopefully something that can be read back in using `read' to get an
equivalent object.)
765 `symbols.c' implements the handling of symbols, obarrays, and
766 retrieving the values of symbols. Much of the code is devoted to
767 handling the special "symbol-value-magic" objects that define special
768 types of variables - this includes buffer-local variables, variable
769 aliases, variables that forward into C variables, etc. This module is
770 initialized extremely early (right after `alloc.c'), because it is here
771 that the basic symbols `t' and `nil' are created, and those symbols are
772 used everywhere throughout XEmacs.
774 `symeval.h' contains the definitions of symbol structures and the
775 `DEFVAR_LISP()' and related macros for declaring variables.
781 These modules implement the methods and standard Lisp primitives for
782 all the basic Lisp object types other than symbols (which are described
783 above). `data.c' contains all the predicates (primitives that return
784 whether an object is of a particular type); the integer arithmetic
785 functions; and the basic accessor and mutator primitives for the various
786 object types. `fns.c' contains all the standard primitives for working
787 with sequences (where, abstractly speaking, a sequence is an ordered set
788 of objects, and can be represented by a list, string, vector, or
789 bit-vector); it also contains `equal', perhaps on the grounds that the
790 bulk of the operation of `equal' is comparing sequences. `floatfns.c'
791 contains methods and primitives for floats and floating-point
797 `bytecode.c' implements the byte-code interpreter and
798 compiled-function objects, and `bytecode.h' contains associated
799 structures. Note that the byte-code *compiler* is written in Lisp.
802 File: internals.info, Node: Modules for Standard Editing Operations, Next: Editor-Level Control Flow Modules, Prev: Basic Lisp Modules, Up: A Summary of the Various XEmacs Modules
804 Modules for Standard Editing Operations
805 =======================================
811 `buffer.c' implements the "buffer" Lisp object type. This includes
812 functions that create and destroy buffers; retrieve buffers by name or
813 by other properties; manipulate lists of buffers (remember that buffers
814 are permanent objects and are stored in various ordered lists); retrieve or
815 change buffer properties; etc. It also contains the definitions of all
816 the built-in buffer-local variables (which can be viewed as buffer
817 properties). It does *not* contain code to manipulate buffer-local
818 variables (that's in `symbols.c', described above); or code to
819 manipulate the text in a buffer.
821 `buffer.h' defines the structures associated with a buffer and the
822 various macros for retrieving text from a buffer and special buffer
823 positions (e.g. `point', the default location for text insertion). It
824 also contains macros for working with buffer positions and converting
825 between their representations as character offsets and as byte offsets
826 (under MULE, they are different, because characters can be multi-byte).
827 It is one of the largest header files.
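
To illustrate why the two representations differ, here is a minimal
sketch in C. It assumes a UTF-8-style encoding purely as a stand-in
(the internal MULE encoding is different), and the names
`char_to_byte_offset' and `byte_to_char_offset' are hypothetical; they
are not the macros defined in `buffer.h'.

```c
#include <stddef.h>

/* Assumption: UTF-8 is used as a stand-in encoding.  A byte starts a
   character unless it has the form 10xxxxxx (a continuation byte). */
static int starts_char(unsigned char b)
{
    return (b & 0xC0) != 0x80;
}

/* Convert a character offset to a byte offset by scanning from the
   start of the text.  Returns len when char_off is at or past the end. */
size_t char_to_byte_offset(const unsigned char *text, size_t len,
                           size_t char_off)
{
    size_t byte_off = 0;
    size_t chars_seen = 0;

    while (byte_off < len) {
        if (starts_char(text[byte_off])) {
            if (chars_seen == char_off)
                return byte_off;
            chars_seen++;
        }
        byte_off++;
    }
    return len;
}

/* Convert a byte offset to a character offset by counting the
   characters that start before byte_off. */
size_t byte_to_char_offset(const unsigned char *text, size_t len,
                           size_t byte_off)
{
    size_t chars = 0;
    size_t i;

    if (byte_off > len)
        byte_off = len;
    for (i = 0; i < byte_off; i++)
        if (starts_char(text[i]))
            chars++;
    return chars;
}
```

Both conversions here are linear scans, which is the cost that makes
the distinction matter: with single-byte characters the two offsets
are always equal and no scan is needed.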
829 `bufslots.h' defines the fields in the buffer structure that
830 correspond to the built-in buffer-local variables. It is its own
831 header file because it is included many times in `buffer.c', as a way
832 of iterating over all the built-in buffer-local variables.
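
The "include it many times" idiom can be sketched as follows. This is
only an illustration of the general technique: it uses a single
hypothetical `BUFFER_SLOTS' list macro in place of a separate header
file, and the slot names and the `toy_' functions are invented for the
example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical slot list; the real slots are defined in bufslots.h. */
#define BUFFER_SLOTS(X) \
    X(fill_column)      \
    X(tab_width)        \
    X(left_margin)

/* Expansion 1: one structure field per slot. */
struct toy_buffer {
#define DECLARE_SLOT(name) int name;
    BUFFER_SLOTS(DECLARE_SLOT)
#undef DECLARE_SLOT
};

/* Expansion 2: a table of slot names, in the same order. */
static const char *const slot_names[] = {
#define SLOT_NAME(name) #name,
    BUFFER_SLOTS(SLOT_NAME)
#undef SLOT_NAME
};

/* Expansion 3: code that visits every slot. */
void toy_buffer_reset(struct toy_buffer *b)
{
#define RESET_SLOT(name) b->name = 0;
    BUFFER_SLOTS(RESET_SLOT)
#undef RESET_SLOT
}

size_t slot_count(void)
{
    return sizeof slot_names / sizeof slot_names[0];
}

/* Return the index of the named slot, or -1 if there is none. */
int slot_index(const char *name)
{
    size_t i;

    for (i = 0; i < slot_count(); i++)
        if (strcmp(slot_names[i], name) == 0)
            return (int) i;
    return -1;
}
```

Adding a slot to the list updates the structure, the name table, and
the reset code together, which is the benefit of keeping the list in
one place.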
837 `insdel.c' contains low-level functions for inserting and deleting
838 text in a buffer, keeping track of changed regions for use by
839 redisplay, and calling any before-change and after-change functions
840 that may have been registered for the buffer. It also contains the
841 actual functions that convert between byte offsets and character
844 `insdel.h' contains associated headers.
848 This module implements the "marker" Lisp object type, which
849 conceptually is a pointer to a text position in a buffer that moves
850 around as text is inserted and deleted, so as to remain in the same
851 relative position. This module doesn't actually move the markers around
852 - that's handled in `insdel.c'. This module just creates them and
853 implements the primitives for working with them. As markers are simple
854 objects, this does not entail much.
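
How a marker stays in the same relative position can be sketched in a
few lines of C. The structure and function names below are
hypothetical, positions are plain character offsets, and the policy
for a marker sitting exactly at the insertion point (it stays put) is
an assumption of the sketch rather than a statement about `insdel.c'.

```c
#include <stddef.h>

/* Hypothetical marker: just a position in the text. */
struct toy_marker {
    size_t pos;
};

/* After inserting len characters at position at, shift every marker
   that lies strictly after the insertion point.  A marker exactly at
   the insertion point is left in place (an assumed policy). */
void markers_adjust_for_insert(struct toy_marker *m, size_t n,
                               size_t at, size_t len)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (m[i].pos > at)
            m[i].pos += len;
}

/* After deleting the range [from, to), markers after the range shift
   left and markers inside it collapse to the start of the range. */
void markers_adjust_for_delete(struct toy_marker *m, size_t n,
                               size_t from, size_t to)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (m[i].pos >= to)
            m[i].pos -= to - from;
        else if (m[i].pos > from)
            m[i].pos = from;
    }
}
```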
856 Note that the standard arithmetic primitives (e.g. `+') accept
857 markers in place of integers and automatically substitute the value of
858 `marker-position' for the marker, i.e. an integer describing the
859 current buffer position of the marker.
864 This module implements the "extent" Lisp object type, which is like
865 a marker that works over a range of text rather than a single position.
866 Extents are also much more complex and powerful than markers and have a
867 more efficient (and more algorithmically complex) implementation. The
868 implementation is described in detail in comments in `extents.c'.
870 The code in `extents.c' works closely with `insdel.c' so that
871 extents are properly moved around as text is inserted and deleted.
872 There is also code in `extents.c' that provides information needed by
873 the redisplay mechanism for efficient operation. (Remember that extents
874 can have display properties that affect [sometimes drastically, as in
875 the `invisible' property] the display of the text they cover.)
879 `editfns.c' contains the standard Lisp primitives for working with a
880 buffer's text, and calls the low-level functions in `insdel.c'. It
881 also contains primitives for working with `point' (the default buffer
884 `editfns.c' also contains functions for retrieving various
885 characteristics from the external environment: the current time, the
886 process ID of the running XEmacs process, the name of the user who ran
887 this XEmacs process, etc. It's not clear why this code is in
894 These modules implement the basic "interactive" commands, i.e.
895 user-callable functions. Commands, as opposed to other functions, have
896 special ways of getting their parameters interactively (by querying the
897 user), as opposed to having them passed in a normal function
898 invocation. Many commands are not really meant to be called from other
899 Lisp functions, because they modify global state in a way that's often
900 undesired as part of other Lisp functions.
902 `callint.c' implements the mechanism for querying the user for
903 parameters and calling interactive commands. The bulk of this module is
904 code that parses the interactive spec that is supplied with an
907 `cmds.c' implements the basic, most commonly used editing commands:
908 commands to move around the current buffer and insert and delete
909 characters. These commands are implemented using the Lisp primitives
910 defined in `editfns.c'.
912 `commands.h' contains associated structure definitions and
919 `search.c' implements the Lisp primitives for searching for text in
920 a buffer, and some of the low-level algorithms for doing this. In
921 particular, the fast fixed-string Boyer-Moore search algorithm is
922 implemented in `search.c'. The low-level algorithms for doing
923 regular-expression searching, however, are implemented in `regex.c' and
924 `regex.h'. These two modules are largely independent of XEmacs, and
925 are similar to (and based upon) the regular-expression routines used in
926 `grep' and other GNU utilities.
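
For readers unfamiliar with this family of algorithms, the following
is a minimal sketch of the Horspool simplification of Boyer-Moore. It
is not the code in `search.c' (which is not reproduced here); the
function name `horspool_search' is hypothetical. The idea it shows is
the one that makes fixed-string search fast: compare from the end of
the pattern and, on a mismatch, skip ahead by a precomputed distance.

```c
#include <stddef.h>
#include <limits.h>

/* Return the offset of the first occurrence of pat (length m) in
   text (length n), or (size_t) -1 if there is none. */
size_t horspool_search(const unsigned char *text, size_t n,
                       const unsigned char *pat, size_t m)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos;

    if (m == 0)
        return 0;
    if (m > n)
        return (size_t) -1;

    /* By default, a mismatch lets us skip the whole pattern length. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;
    /* Bytes that occur in the pattern (except its last byte) allow a
       smaller skip that realigns them with the text. */
    for (i = 0; i + 1 < m; i++)
        shift[pat[i]] = m - 1 - i;

    pos = 0;
    while (pos <= n - m) {
        size_t j = m;

        /* Compare right to left within the current window. */
        while (j > 0 && text[pos + j - 1] == pat[j - 1])
            j--;
        if (j == 0)
            return pos;
        /* Skip according to the text byte under the window's end. */
        pos += shift[text[pos + m - 1]];
    }
    return (size_t) -1;
}
```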
930 `doprnt.c' implements formatted-string processing, similar to the
931 `printf()' function in C.
935 This module implements the undo mechanism for tracking buffer
936 changes. Most of this could be implemented in Lisp.
939 File: internals.info, Node: Editor-Level Control Flow Modules, Next: Modules for the Basic Displayable Lisp Objects, Prev: Modules for Standard Editing Operations, Up: A Summary of the Various XEmacs Modules
941 Editor-Level Control Flow Modules
942 =================================
950 These implement the handling of events (user input and other system
953 `events.c' and `events.h' define the "event" Lisp object type and
954 primitives for manipulating it.
956 `event-stream.c' implements the basic functions for working with
957 event queues, dispatching an event by looking it up in relevant keymaps
958 and such, and handling timeouts; this includes the primitives
959 `next-event' and `dispatch-event', as well as related primitives such
960 as `sit-for', `sleep-for', and `accept-process-output'.
961 (`event-stream.c' is one of the hairiest and trickiest modules in
962 XEmacs. Beware! You can easily mess things up here.)
964 `event-Xt.c' and `event-tty.c' implement the low-level interfaces
965 for retrieving events from Xt (the X toolkit) and from TTYs (using
966 `read()' and `select()'), respectively. The event interface enforces a
967 clean separation between the specific code for interfacing with the
968 operating system and the generic code for working with events, by
969 defining an API of basic, low-level event methods; `event-Xt.c' and
970 `event-tty.c' are two different implementations of this API. To add
971 support for a new operating system (e.g. NeXTstep), one merely needs to
972 provide another implementation of those API functions.
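
The shape of such an API can be sketched as a table of function
pointers with one instance per back end. Everything named below
(`toy_event', `toy_event_methods', the two back ends, and `toy_poll')
is hypothetical and chosen for illustration; the actual method names
and signatures are defined in the event code itself.

```c
#include <stddef.h>

/* Hypothetical generic event. */
struct toy_event {
    int type;
    int data;
};

/* Hypothetical method table: the low-level event API. */
struct toy_event_methods {
    const char *name;
    int (*event_pending_p)(void);
    void (*next_event)(struct toy_event *out);
};

/* Back end A: always has a key event ready. */
static int a_pending(void) { return 1; }
static void a_next(struct toy_event *out)
{
    out->type = 1;
    out->data = 'x';
}
const struct toy_event_methods toy_backend_a = {
    "backend-a", a_pending, a_next
};

/* Back end B: never has anything pending. */
static int b_pending(void) { return 0; }
static void b_next(struct toy_event *out)
{
    out->type = 0;
    out->data = 0;
}
const struct toy_event_methods toy_backend_b = {
    "backend-b", b_pending, b_next
};

/* Generic code: works with any back end through the method table.
   Returns 1 and fills in *out if an event was available, else 0. */
int toy_poll(const struct toy_event_methods *m, struct toy_event *out)
{
    if (!m->event_pending_p())
        return 0;
    m->next_event(out);
    return 1;
}
```

The generic function never mentions a particular back end, so
supporting another system amounts to supplying one more table.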
974 Note that the choice of whether to use `event-Xt.c' or `event-tty.c'
975 is made at compile time! Or at the very latest, it is made at startup
976 time. `event-Xt.c' handles events for *both* X and TTY frames;
977 `event-tty.c' is only used when X support is not compiled into XEmacs.
978 The reason for this is that there is only one event loop in XEmacs:
979 thus, it needs to be able to receive events from all different kinds of
985 `keymap.c' and `keymap.h' define the "keymap" Lisp object type and
986 associated methods and primitives. (Remember that keymaps are objects
987 that associate event descriptions with functions to be called to
988 "execute" those events; `dispatch-event' looks up events in the
993 `keyboard.c' contains functions that implement the actual editor
994 command loop - i.e. the event loop that cyclically retrieves and
995 dispatches events. This code is also rather tricky, just like
1001 These two modules contain the basic code for defining keyboard
1002 macros. These functions don't actually do much; most of the code that
1003 handles keyboard macros is mixed in with the event-handling code in
1008 This contains some miscellaneous code related to the minibuffer
1009 (most of the minibuffer code was moved into Lisp by Richard Mlynarik).
1010 This includes the primitives for completion (although filename
1011 completion is in `dired.c'), the lowest-level interface to the
1012 minibuffer (if the command loop were cleaned up, this too could be in
1013 Lisp), and code for dealing with the echo area (this, too, was mostly
1014 moved into Lisp, and the only code remaining is code to call out to
1015 Lisp or provide simple bootstrapping implementations early in temacs,
1016 before the echo-area Lisp code is loaded).
1019 File: internals.info, Node: Modules for the Basic Displayable Lisp Objects, Next: Modules for other Display-Related Lisp Objects, Prev: Editor-Level Control Flow Modules, Up: A Summary of the Various XEmacs Modules
1021 Modules for the Basic Displayable Lisp Objects
1022 ==============================================
1034 These modules implement the "device" Lisp object type. This
1035 abstracts a particular screen or connection on which frames are
1036 displayed. As with Lisp objects, event interfaces, and other
1037 subsystems, the device code is separated into a generic component that
1038 contains a standardized interface (in the form of a set of methods) onto
1039 particular device types.
1041 The device subsystem defines all the methods and provides method
1042 services not only for device operations but also for the frame, window,
1043 menubar, scrollbar, toolbar, and other displayable-object subsystems.
1044 The reason for this is that all of these subsystems have the same
1045 subtypes (X, TTY, NeXTstep, Microsoft Windows, etc.) as devices do.
1054 Each device contains one or more frames in which objects (e.g. text)
1055 are displayed. A frame corresponds to a window in the window system;
1056 usually this is a top-level window but it could potentially be one of a
1057 number of overlapping child windows within a top-level window, using the
1058 MDI (Multiple Document Interface) protocol in Microsoft Windows or a
1061 The `frame-*' files implement the "frame" Lisp object type and
1062 provide the generic and device-type-specific operations on frames (e.g.
1063 raising, lowering, resizing, moving, etc.).
1068 Each frame consists of one or more non-overlapping "windows" (better
1069 known as "panes" in standard window-system terminology) in which a
1070 buffer's text can be displayed. Windows can also have scrollbars
1071 displayed around their edges.
1073 `window.c' and `window.h' implement the "window" Lisp object type
1074 and provide code to manage windows. Since windows have no associated
1075 resources in the window system (the window system knows only about the
1076 frame; no child windows or anything are used for XEmacs windows), there
1077 is no device-type-specific code here; all of that code is part of the
1078 redisplay mechanism or the code for particular object types such as
1082 File: internals.info, Node: Modules for other Display-Related Lisp Objects, Next: Modules for the Redisplay Mechanism, Prev: Modules for the Basic Displayable Lisp Objects, Up: A Summary of the Various XEmacs Modules
1084 Modules for other Display-Related Lisp Objects
1085 ==============================================
1119 This file provides C support for syntax highlighting - i.e.
1120 highlighting different syntactic constructs of a source file in
1121 different colors, for easy reading. The C support is provided so that
1129 These modules decode GIF-format image files, for use with glyphs.
1132 File: internals.info, Node: Modules for the Redisplay Mechanism, Next: Modules for Interfacing with the File System, Prev: Modules for other Display-Related Lisp Objects, Up: A Summary of the Various XEmacs Modules
1134 Modules for the Redisplay Mechanism
1135 ===================================
1143 These files provide the redisplay mechanism. As with many other
1144 subsystems in XEmacs, there is a clean separation between the general
1145 and device-specific support.
1147 `redisplay.c' contains the bulk of the redisplay engine. These
1148 functions update the redisplay structures (which describe how the screen
1149 is to appear) to reflect any changes made to the state of any
1150 displayable objects (buffer, frame, window, etc.) since the last time
1151 that redisplay was called. These functions are highly optimized to
1152 avoid doing more work than necessary (since redisplay is called
1153 extremely often and is potentially a huge time sink), and depend heavily
1154 on notifications from the objects themselves that changes have occurred,
1155 so that redisplay doesn't explicitly have to check each possible object.
1156 The redisplay mechanism also contains a great deal of caching to further
1157 speed things up; some of this caching is contained within the various
1158 displayable objects.
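
The notification idea can be sketched with a simple "changed" flag.
This is a deliberately tiny model, with hypothetical names
(`toy_window', `toy_window_note_change', `toy_redisplay'); it shows
only how a flag set by the modifying code lets the redisplay pass skip
objects that have not changed, not how `redisplay.c' is organized.

```c
/* Hypothetical displayable object. */
struct toy_window {
    int changed;   /* set by any code that modifies the window */
    int redraws;   /* number of times the window was redrawn */
};

/* Called by code that modifies the window: record that redisplay
   has work to do for it. */
void toy_window_note_change(struct toy_window *w)
{
    w->changed = 1;
}

/* One redisplay pass: redraw only the windows flagged as changed and
   clear their flags.  Returns the number of windows redrawn. */
int toy_redisplay(struct toy_window *wins, int n)
{
    int i;
    int redrawn = 0;

    for (i = 0; i < n; i++) {
        if (!wins[i].changed)
            continue;          /* nothing happened here: skip it */
        wins[i].redraws++;     /* stand-in for the real drawing work */
        wins[i].changed = 0;
        redrawn++;
    }
    return redrawn;
}
```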
1160 `redisplay-output.c' goes through the redisplay structures and
1161 converts them into calls to device-specific methods to actually output
1164 `redisplay-x.c' and `redisplay-tty.c' are two implementations of
1165 these redisplay output methods, for X frames and TTY frames,
1170 This module contains various functions and Lisp primitives for
1171 converting between buffer positions and screen positions. These
1172 functions call the redisplay mechanism to do most of the work, and then
1173 examine the redisplay structures to get the necessary information. This
1180 These files contain functions for working with the termcap
1181 (BSD-style) and terminfo (System V style) databases of terminal
1182 capabilities and escape sequences, used when XEmacs is displaying in a
1188 These files provide some miscellaneous TTY-output functions and
1189 should probably be merged into `redisplay-tty.c'.