1 This is Info file ../../info/internals.info, produced by Makeinfo
2 version 1.68 from the input file internals.texi.
4 INFO-DIR-SECTION XEmacs Editor
6 * Internals: (internals). XEmacs Internals Manual.
9 Copyright (C) 1992 - 1996 Ben Wing. Copyright (C) 1996, 1997 Sun
10 Microsystems. Copyright (C) 1994 - 1998 Free Software Foundation.
11 Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
13 Permission is granted to make and distribute verbatim copies of this
14 manual provided the copyright notice and this permission notice are
15 preserved on all copies.
17 Permission is granted to copy and distribute modified versions of
18 this manual under the conditions for verbatim copying, provided that the
19 entire resulting derived work is distributed under the terms of a
20 permission notice identical to this one.
22 Permission is granted to copy and distribute translations of this
23 manual into another language, under the above conditions for modified
24 versions, except that this permission notice may be stated in a
25 translation approved by the Foundation.
27 Permission is granted to copy and distribute modified versions of
28 this manual under the conditions for verbatim copying, provided also
29 that the section entitled "GNU General Public License" is included
30 exactly as in the original, and provided that the entire resulting
31 derived work is distributed under the terms of a permission notice
32 identical to this one.
34 Permission is granted to copy and distribute translations of this
35 manual into another language, under the above conditions for modified
36 versions, except that the section entitled "GNU General Public License"
37 may be included in a translation approved by the Free Software
38 Foundation instead of in the original English.
41 File: internals.info, Node: Modules for Interfacing with the File System, Next: Modules for Other Aspects of the Lisp Interpreter and Object System, Prev: Modules for the Redisplay Mechanism, Up: A Summary of the Various XEmacs Modules
43 Modules for Interfacing with the File System
44 ============================================
49 These modules implement the "stream" Lisp object type. This is an
50 internal-only Lisp object that implements a generic buffering stream.
51 The idea is to provide a uniform interface onto all sources and sinks of
52 data, including file descriptors, stdio streams, chunks of memory, Lisp
53 buffers, Lisp strings, etc. That way, I/O functions can be written to
54 the stream interface and can transparently handle all possible sources
55 and sinks. (For example, the `read' function can read data from a
56 file, a string, a buffer, or even a function that is called repeatedly
57 to return data, without worrying about where the data is coming from or
58 what-size chunks it is returned in.)
60 Note that in the C code, streams are called "lstreams" (for "Lisp
61 streams") to distinguish them from other kinds of streams, e.g. stdio
62 streams and C++ I/O streams.
64 Similar to other subsystems in XEmacs, lstreams are separated into
65 generic functions and a set of methods for the different types of
66 lstreams. `lstream.c' provides implementations of many different types
67 of streams; others are provided, e.g., in `mule-coding.c'.
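The split between generic functions and per-type methods can be sketched in a few lines of C. This is only an illustration of the idea, not the interface in `lstream.c': every name below (`toy_stream', `memory_reader', `chunked_reader', `toy_stream_read_all') is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the real lstream.c API. */
struct toy_stream;

/* Per-type method table: each stream type supplies its own reader. */
struct toy_stream_methods {
  const char *name;
  /* Read up to SIZE bytes into BUF; return the count, 0 at end of data. */
  size_t (*reader) (struct toy_stream *s, char *buf, size_t size);
};

struct toy_stream {
  const struct toy_stream_methods *methods;
  const char *data;   /* backing memory */
  size_t len, pos;    /* total length and current read position */
  size_t chunk;       /* chunked type only: maximum bytes per call */
};

/* Type 1: a block of memory, returned in pieces as large as requested. */
size_t memory_reader (struct toy_stream *s, char *buf, size_t size)
{
  size_t n = s->len - s->pos;
  if (n > size)
    n = size;
  memcpy (buf, s->data + s->pos, n);
  s->pos += n;
  return n;
}

/* Type 2: the same data, but never more than CHUNK bytes per call,
   standing in for a source that returns data in odd-sized pieces. */
size_t chunked_reader (struct toy_stream *s, char *buf, size_t size)
{
  if (size > s->chunk)
    size = s->chunk;
  return memory_reader (s, buf, size);
}

const struct toy_stream_methods memory_methods = { "memory", memory_reader };
const struct toy_stream_methods chunked_methods = { "chunked", chunked_reader };

/* Generic function: written once against the method table, so it
   neither knows nor cares which type of stream it is draining. */
size_t toy_stream_read_all (struct toy_stream *s, char *out, size_t cap)
{
  size_t total = 0, n;
  assert (s != NULL && s->methods != NULL);
  while (total < cap
         && (n = s->methods->reader (s, out + total, cap - total)) > 0)
    total += n;
  return total;
}
```

A caller fills in a `struct toy_stream' with one of the two method tables and calls `toy_stream_read_all'; the bytes delivered are the same whichever table was chosen.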
71 This implements the basic primitives for interfacing with the file
72 system. This includes primitives for reading files into buffers,
73 writing buffers into files, checking for the presence or accessibility
74 of files, canonicalizing file names, etc. Note that these primitives
75 are usually not invoked directly by the user: There is a great deal of
76 higher-level Lisp code that implements the user commands such as
77 `find-file' and `save-buffer'. This is similar to the distinction
78 between the lower-level primitives in `editfns.c' and the higher-level
79 user commands in `commands.c' and `simple.el'.
83 This file provides functions for detecting clashes between different
84 processes (e.g. XEmacs and some external process, or two different
85 XEmacs processes) modifying the same file. (XEmacs can optionally use
86 the `lock/' subdirectory to provide a form of "locking" between
87 different XEmacs processes.) This module is also used by the low-level
88 functions in `insdel.c' to ensure that, if the first modification is
89 being made to a buffer whose corresponding file has been externally
90 modified, the user is made aware of this so that the buffer can be
91 synched up with the external changes if necessary.
95 This file provides some miscellaneous functions that construct a
96 `rwxr-xr-x'-type permissions string (as might appear in an `ls'-style
97 directory listing) given the information returned by the `stat()'
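As a rough sketch of what building such a string involves, the following C function maps the nine permission bits of a `mode_t' value onto the familiar letters. It is an assumption-laden illustration rather than the code in this module: the name `toy_mode_string' is hypothetical, and the file-type character and the setuid, setgid, and sticky bits are ignored.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: fill OUT (at least 10 bytes) with the nine
   permission characters for MODE, e.g. "rwxr-xr-x" for 0755. */
void toy_mode_string (mode_t mode, char *out)
{
  static const mode_t bits[9] = {
    S_IRUSR, S_IWUSR, S_IXUSR,   /* owner */
    S_IRGRP, S_IWGRP, S_IXGRP,   /* group */
    S_IROTH, S_IWOTH, S_IXOTH    /* others */
  };
  static const char letters[] = "rwxrwxrwx";
  int i;

  assert (out != NULL);
  for (i = 0; i < 9; i++)
    out[i] = (mode & bits[i]) ? letters[i] : '-';
  out[9] = '\0';
}
```

In practice the `mode' argument would come from the `st_mode' field filled in by `stat()'.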
103 These files implement the XEmacs interface to directory searching.
104 This includes a number of primitives for determining the files in a
105 directory and for doing filename completion. (Remember that generic
106 completion is handled by a different mechanism, in `minibuf.c'.)
108 `ndir.h' is a header file used for the directory-searching emulation
109 functions provided in `sysdep.c' (see section J below), for systems
110 that don't provide any directory-searching functions. (On those
111 systems, directories can be read directly as files, and parsed.)
115 This file provides an implementation of the `realpath()' function
116 for expanding symbolic links, on systems that don't implement it or have
117 a broken implementation.
120 File: internals.info, Node: Modules for Other Aspects of the Lisp Interpreter and Object System, Next: Modules for Interfacing with the Operating System, Prev: Modules for Interfacing with the File System, Up: A Summary of the Various XEmacs Modules
122 Modules for Other Aspects of the Lisp Interpreter and Object System
123 ===================================================================
These files provide two implementations of hash tables. Files
`hash.c' and `hash.h' provide a generic C implementation of hash tables
which can stand independently of XEmacs. Files `elhash.c' and
`elhash.h' provide a separate implementation of hash tables that can
store only Lisp objects, knows about Lispy things like garbage
collection, and implements the "hash-table" Lisp object type.
This module implements the "specifier" Lisp object type. This is
primarily used for displayable properties, and allows for values that
are specific to a particular buffer, window, frame, device, or device
class, as well as for a default value. This is used, for example,
to control the height of the horizontal scrollbar or the appearance of
the `default', `bold', or other faces. The specifier object consists
of a number of specifications, each of which maps from a buffer,
window, etc. to a value. The function `specifier-instance' looks up a
value given a window (from which a buffer, frame, and device can be
155 `chartab.c' and `chartab.h' implement the "char table" Lisp object
156 type, which maps from characters or certain sorts of character ranges
157 to Lisp objects. The implementation of this object type is optimized
158 for the internal representation of characters. Char tables come in
159 different types, which affect the allowed object types to which a
160 character can be mapped and also dictate certain other properties of
163 `casetab.c' implements one sort of char table, the "case table",
164 which maps characters to other characters of possibly different case.
165 These are used by XEmacs to implement case-changing primitives and to
166 do case-insensitive searching.
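The way a case table supports case-insensitive comparison can be sketched as follows. This is a minimal illustration assuming one-byte ASCII characters and a flat array; the real char tables are optimized for the internal character representation, and all names here (`toy_case_table', `toy_equal_ignoring_case') are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NCHARS 256

/* Hypothetical case table: maps each one-byte character to its
   lower-case counterpart (or to itself if it has none). */
struct toy_case_table { unsigned char down[TOY_NCHARS]; };

void toy_case_table_init_ascii (struct toy_case_table *t)
{
  int c;
  for (c = 0; c < TOY_NCHARS; c++)
    t->down[c] = (unsigned char) c;
  for (c = 'A'; c <= 'Z'; c++)          /* assumes ASCII ordering */
    t->down[c] = (unsigned char) (c - 'A' + 'a');
}

/* Case-insensitive equality: map both characters through the table
   before comparing them.  Returns 1 if equal, 0 otherwise. */
int toy_equal_ignoring_case (const struct toy_case_table *t,
                             const char *a, const char *b)
{
  assert (t != NULL && a != NULL && b != NULL);
  while (*a && *b)
    {
      if (t->down[(unsigned char) *a] != t->down[(unsigned char) *b])
        return 0;
      a++, b++;
    }
  return *a == *b;   /* equal only if both strings ended together */
}
```

Because the mapping lives in a table rather than in the comparison code, swapping in a different table changes the notion of "same letter" without touching the search routine.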
171 This module implements "syntax tables", another sort of char table
172 that maps characters into syntax classes that define the syntax of these
173 characters (e.g. a parenthesis belongs to a class of `open' characters
174 that have corresponding `close' characters and can be nested). This
175 module also implements the Lisp "scanner", a set of primitives for
176 scanning over text based on syntax tables. This is used, for example,
177 to find the matching parenthesis in a command such as `forward-sexp',
178 and by `font-lock.c' to locate quoted strings, comments, etc.
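A toy version of syntax-driven scanning might look like the C sketch below. It assumes only three syntax classes and one-byte characters, and every name in it is hypothetical; the real scanner handles many more classes (strings, comments, quoting, and so on).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical syntax classes; the real tables define many more. */
enum toy_syntax { TOY_OTHER, TOY_OPEN, TOY_CLOSE };

struct toy_syntax_entry {
  enum toy_syntax klass;
  char match;            /* for TOY_OPEN: the corresponding closer */
};

struct toy_syntax_table { struct toy_syntax_entry e[256]; };

void toy_syntax_init (struct toy_syntax_table *t)
{
  int c;
  for (c = 0; c < 256; c++)
    {
      t->e[c].klass = TOY_OTHER;
      t->e[c].match = 0;
    }
  t->e['('].klass = TOY_OPEN;  t->e['('].match = ')';
  t->e['['].klass = TOY_OPEN;  t->e['['].match = ']';
  t->e[')'].klass = TOY_CLOSE;
  t->e[']'].klass = TOY_CLOSE;
}

/* Scan forward from START and return the index of the close character
   that ends the first balanced group, or -1 if the text is unbalanced,
   nested too deeply for this sketch, or contains no group. */
long toy_scan_forward (const struct toy_syntax_table *t,
                       const char *text, size_t start)
{
  char expected[64];     /* stack of closers we are waiting for */
  size_t depth = 0, i;

  assert (t != NULL && text != NULL);
  for (i = start; text[i] != '\0'; i++)
    {
      const struct toy_syntax_entry *e = &t->e[(unsigned char) text[i]];
      if (e->klass == TOY_OPEN)
        {
          if (depth == sizeof expected)
            return -1;
          expected[depth++] = e->match;
        }
      else if (e->klass == TOY_CLOSE)
        {
          if (depth == 0 || expected[depth - 1] != text[i])
            return -1;
          if (--depth == 0)
            return (long) i;
        }
    }
  return -1;
}
```

The scanner never tests for a literal parenthesis; it only asks the table which class a character belongs to, which is what lets one scanner serve languages with different delimiters.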
182 This module implements various Lisp primitives for upcasing,
183 downcasing and capitalizing strings or regions of buffers.
187 This module implements the "range table" Lisp object type, which
188 provides for a mapping from ranges of integers to arbitrary Lisp
194 This module implements the "opaque" Lisp object type, an
195 internal-only Lisp object that encapsulates an arbitrary block of memory
196 so that it can be managed by the Lisp allocation system. To create an
197 opaque object, you call `make_opaque()', passing a pointer to a block
198 of memory. An object is created that is big enough to hold the memory,
199 which is copied into the object's storage. The object will then stick
200 around as long as you keep pointers to it, after which it will be
201 automatically reclaimed.
203 Opaque objects can also have an arbitrary "mark method" associated
204 with them, in case the block of memory contains other Lisp objects that
205 need to be marked for garbage-collection purposes. (If you need other
206 object methods, such as a finalize method, you should just go ahead and
207 create a new Lisp object type - it's not hard.)
This module provides a few primitives for doing dynamic
abbreviation expansion. In XEmacs, most of the code for this has been
moved into Lisp. Some C code remains for speed and because the
primitive `self-insert-command' (which is executed for all
self-inserting characters) hooks into the abbrev mechanism.
(`self-insert-command' is itself in C only for speed.)
This module provides primitives for retrieving the documentation
strings of functions and variables. These documentation strings contain
certain special markers that get dynamically expanded (e.g. a
reverse-lookup is performed on some named functions to retrieve their
current key bindings). Some documentation strings (in particular, for
the built-in primitives and pre-loaded Lisp functions) are stored
externally in a file `DOC' in the `lib-src/' directory and need to be
fetched from that file. (Part of the build stage involves building this
file, and another part involves constructing an index for this file and
embedding it into the executable, so that the functions in `doc.c' do
not have to search the entire `DOC' file to find the appropriate
documentation string.)
This module provides a Lisp primitive that implements the MD5
secure hashing scheme, used to create a large hash value of a string of
data such that the data cannot be derived from the hash value. This is
used for various security applications on the Internet.
241 File: internals.info, Node: Modules for Interfacing with the Operating System, Next: Modules for Interfacing with X Windows, Prev: Modules for Other Aspects of the Lisp Interpreter and Object System, Up: A Summary of the Various XEmacs Modules
243 Modules for Interfacing with the Operating System
244 =================================================
250 These modules allow XEmacs to spawn and communicate with subprocesses
251 and network connections.
253 `callproc.c' implements (through the `call-process' primitive) what
254 are called "synchronous subprocesses". This means that XEmacs runs a
255 program, waits till it's done, and retrieves its output. A typical
256 example might be calling the `ls' program to get a directory listing.
258 `process.c' and `process.h' implement "asynchronous subprocesses".
259 This means that XEmacs starts a program and then continues normally,
260 not waiting for the process to finish. Data can be sent to the process
261 or retrieved from it as it's running. This is used for the `shell'
262 command (which provides a front end onto a shell program such as
263 `csh'), the mail and news readers implemented in XEmacs, etc. The
264 result of calling `start-process' to start a subprocess is a process
265 object, a particular kind of object used to communicate with the
266 subprocess. You can send data to the process by passing the process
267 object and the data to `send-process', and you can specify what happens
268 to data retrieved from the process by setting properties of the process
269 object. (When the process sends data, XEmacs receives a process event,
270 which says that there is data ready. When `dispatch-event' is called
271 on this event, it reads the data from the process and does something
272 with it, as specified by the process object's properties. Typically,
273 this means inserting the data into a buffer or calling a function.)
274 Another property of the process object is called the "sentinel", which
275 is a function that is called when the process terminates.
277 Process objects are also used for network connections (connections
278 to a process running on another machine). Network connections are
279 started with `open-network-stream' but otherwise work just like
285 These modules implement most of the low-level, messy operating-system
286 interface code. This includes various device control (ioctl) operations
287 for file descriptors, TTY's, pseudo-terminals, etc. (usually this stuff
288 is fairly system-dependent; thus the name of this module), and emulation
289 of standard library functions and system calls on systems that don't
290 provide them or have broken versions.
These header files provide consistent interfaces onto
system-dependent header files and system calls. The idea is that,
instead of including a standard header file like `<sys/param.h>' (which
may or may not exist on various systems) or having to worry about
whether all systems provide a particular preprocessor constant, or
having to deal with the four different paradigms for manipulating
signals, you just include the appropriate `sys*.h' header file, which
includes all the right system header files, defines any missing
preprocessor constants, provides a uniform interface onto system calls,
313 `sysdir.h' provides a uniform interface onto directory-querying
314 functions. (In some cases, this is in conjunction with emulation
315 functions in `sysdep.c'.)
317 `sysfile.h' includes all the necessary header files for standard
318 system calls (e.g. `read()'), ensures that all necessary `open()' and
319 `stat()' preprocessor constants are defined, and possibly (usually)
320 substitutes sugared versions of `read()', `write()', etc. that
321 automatically restart interrupted I/O operations.
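The "restart interrupted I/O" idea is a standard POSIX idiom and can be sketched as below. The function name `toy_read_restarting' is hypothetical and this is not the wrapper that `sysfile.h' actually installs; it only shows the retry-on-`EINTR' loop such a wrapper would be built around.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: behaves like read(), except that a call cut
   short by a signal before any data arrived (errno == EINTR) is
   silently retried instead of being reported to the caller. */
ssize_t toy_read_restarting (int fd, void *buf, size_t size)
{
  ssize_t n;

  assert (buf != NULL);
  do
    n = read (fd, buf, size);
  while (n == -1 && errno == EINTR);
  return n;
}
```

Callers then only have to handle genuine errors and end-of-file, not the case where a signal handler happened to run in the middle of the system call.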
323 `sysfloat.h' includes the necessary header files for floating-point
326 `sysproc.h' includes the necessary header files for calling
327 `select()', `fork()', `execve()', socket operations, and the like, and
328 ensures that the `FD_*()' macros for descriptor-set manipulations are
331 `syspwd.h' includes the necessary header files for obtaining
332 information from `/etc/passwd' (the functions are emulated under VMS).
334 `syssignal.h' includes the necessary header files for
335 signal-handling and provides a uniform interface onto the different
336 signal-handling and signal-blocking paradigms.
338 `systime.h' includes the necessary header files and provides uniform
339 interfaces for retrieving the time of day, setting file
340 access/modification times, getting the amount of time used by the XEmacs
343 `systty.h' buffers against the infinitude of different ways of
346 `syswait.h' provides a uniform way of retrieving the exit status
347 from a `wait()'ed-on process (some systems use a union, others use an
360 These files implement the ability to play various sounds on some
361 types of computers. You have to configure your XEmacs with sound
362 support in order to get this capability.
`sound.c' provides the generic interface. It implements various
Lisp primitives and variables that let you specify which sounds should
be played in certain conditions. (The conditions are identified by
symbols, which are passed to `ding' to make a sound. Various standard
functions call this function at certain times; if sound support does
not exist, a simple beep results.)
`sgiplay.c', `sunplay.c', `hpplay.c', and `linuxplay.c' interface to
the machine's speaker for various different kinds of machines. This is
called "native" sound.
375 `nas.c' interfaces to a computer somewhere else on the network using
376 the NAS (Network Audio Server) protocol, playing sounds on that
377 machine. This allows you to run XEmacs on a remote machine, with its
378 display set to your local machine, and have the sounds be made on your
379 local machine, provided that you have a NAS server running on your local
382 `libsst.c', `libsst.h', and `libst.h' provide some additional
383 functions for playing sound on a Sun SPARC but are not currently in use.
388 These two modules implement an interface to the ToolTalk protocol,
389 which is an interprocess communication protocol implemented on some
390 versions of Unix. ToolTalk is a high-level protocol that allows
391 processes to register themselves as providers of particular services;
392 other processes can then request a service without knowing or caring
393 exactly who is providing the service. It is similar in spirit to the
394 DDE protocol provided under Microsoft Windows. ToolTalk is a part of
395 the new CDE (Common Desktop Environment) specification and is used to
396 connect the parts of the SPARCWorks development environment.
400 This module provides the ability to retrieve the system's current
401 load average. (The way to do this is highly system-specific,
402 unfortunately, and requires a lot of special-case code.)
406 This module provides a small amount of code used internally at Sun to
407 keep statistics on the usage of XEmacs.
414 These files provide replacement functions and prototypes to fix
415 numerous bugs in early releases of SunOS 4.1.
419 This module provides some terminal-control code necessary on
420 versions of AIX prior to 4.1.
425 These modules are used for MS-DOS support, which does not work in
429 File: internals.info, Node: Modules for Interfacing with X Windows, Next: Modules for Internationalization, Prev: Modules for Interfacing with the Operating System, Up: A Summary of the Various XEmacs Modules
431 Modules for Interfacing with X Windows
432 ======================================
436 A file generated from `Emacs.ad', which contains XEmacs-supplied
437 fallback resources (so that XEmacs has pretty defaults).
443 These modules implement an Xt widget class that encapsulates a frame.
444 This is for ease in integrating with Xt. The EmacsFrame widget covers
445 the entire X window except for the menubar; the scrollbars are
446 positioned on top of the EmacsFrame widget.
448 *Warning:* Abandon hope, all ye who enter here. This code took an
449 ungodly amount of time to get right, and is likely to fall apart
450 mercilessly at the slightest change. Such is life under Xt.
456 These modules implement a simple Xt manager (i.e. composite) widget
457 class that simply lets its children set whatever geometry they want.
458 It's amazing that Xt doesn't provide this standardly, but on second
459 thought, it makes sense, considering how amazingly broken Xt is.
466 These modules implement two Xt widget classes that are subclasses of
467 the TopLevelShell and TransientShell classes. This is necessary to deal
468 with more brokenness that Xt has sadistically thrust onto the backs of
474 These modules provide functions for maintenance and caching of GC's
475 (graphics contexts) under the X Window System. This code is junky and
476 needs to be rewritten.
480 This module provides an interface to the X Window System's concept of
481 "selections", the standard way for X applications to communicate with
489 These header files are similar in spirit to the `sys*.h' files and
490 buffer against different implementations of Xt and Motif.
492 * `xintrinsic.h' should be included in place of `<Intrinsic.h>'.
494 * `xintrinsicp.h' should be included in place of `<IntrinsicP.h>'.
496 * `xmmanagerp.h' should be included in place of `<XmManagerP.h>'.
498 * `xmprimitivep.h' should be included in place of `<XmPrimitiveP.h>'.
503 These files provide an emulation of the Xmu library for those systems
504 (i.e. HPUX) that don't provide it as a standard part of X.
506 ExternalClient-Xlib.c
518 These files provide the "external widget" interface, which allows an
519 XEmacs frame to appear as a widget in another application. To do this,
520 you have to configure with `--external-widget'.
522 `ExternalShell*' provides the server (XEmacs) side of the connection.
524 `ExternalClient*' provides the client (other application) side of
525 the connection. These files are not compiled into XEmacs but are
526 compiled into libraries that are then linked into your application.
528 `extw-*' is common code that is used for both the client and server.
530 Don't touch this code; something is liable to break if you do.
533 File: internals.info, Node: Modules for Internationalization, Prev: Modules for Interfacing with X Windows, Up: A Summary of the Various XEmacs Modules
535 Modules for Internationalization
536 ================================
549 These files implement the MULE (Asian-language) support. Note that
550 MULE actually provides a general interface for all sorts of languages,
551 not just Asian languages (although they are generally the most
552 complicated to support). This code is still in beta.
554 `mule-charset.*' and `mule-coding.*' provide the heart of the XEmacs
555 MULE support. `mule-charset.*' implements the "charset" Lisp object
556 type, which encapsulates a character set (an ordered one- or
557 two-dimensional set of characters, such as US ASCII or JISX0208 Japanese
560 `mule-coding.*' implements the "coding-system" Lisp object type,
561 which encapsulates a method of converting between different encodings.
562 An encoding is a representation of a stream of characters, possibly
563 from multiple character sets, using a stream of bytes or words, and
564 defines (e.g.) which escape sequences are used to specify particular
565 character sets, how the indices for a character are converted into bytes
566 (sometimes this involves setting the high bit; sometimes complicated
567 rearranging of the values takes place, as in the Shift-JIS encoding),
570 `mule-ccl.c' provides the CCL (Code Conversion Language)
571 interpreter. CCL is similar in spirit to Lisp byte code and is used to
572 implement converters for custom encodings.
574 `mule-canna.c' and `mule-wnnfns.c' implement interfaces to external
575 programs used to implement the Canna and WNN input methods,
576 respectively. This is currently in beta.
578 `mule-mcpath.c' provides some functions to allow for pathnames
579 containing extended characters. This code is fragmentary, obsolete, and
580 completely non-working. Instead, PATHNAME-CODING-SYSTEM is used to
581 specify conversions of names of files and directories. The standard C
582 I/O functions like `open()' are wrapped so that conversion occurs
585 `mule.c' provides a few miscellaneous things that should probably be
590 This provides some miscellaneous internationalization code for
591 implementing message translation and interfacing to the Ximp input
592 method. None of this code is currently working.
596 This contains leftover code from an earlier implementation of
597 Asian-language support, and is not currently used.
600 File: internals.info, Node: Allocation of Objects in XEmacs Lisp, Next: Events and the Event Loop, Prev: A Summary of the Various XEmacs Modules, Up: Top
602 Allocation of Objects in XEmacs Lisp
603 ************************************
607 * Introduction to Allocation::
608 * Garbage Collection::
610 * Garbage Collection - Step by Step::
611 * Integers and Characters::
612 * Allocation from Frob Blocks::
614 * Low-level allocation::
622 * Compiled Function::
625 File: internals.info, Node: Introduction to Allocation, Next: Garbage Collection, Up: Allocation of Objects in XEmacs Lisp
627 Introduction to Allocation
628 ==========================
630 Emacs Lisp, like all Lisps, has garbage collection. This means that
631 the programmer never has to explicitly free (destroy) an object; it
632 happens automatically when the object becomes inaccessible. Most
633 experts agree that garbage collection is a necessity in a modern,
634 high-level language. Its omission from C stems from the fact that C was
635 originally designed to be a nice abstract layer on top of assembly
636 language, for writing kernels and basic system utilities rather than
639 Lisp objects can be created by any of a number of Lisp primitives.
640 Most object types have one or a small number of basic primitives for
641 creating objects. For conses, the basic primitive is `cons'; for
642 vectors, the primitives are `make-vector' and `vector'; for symbols,
643 the primitives are `make-symbol' and `intern'; etc. Some Lisp objects,
644 especially those that are primarily used internally, have no
645 corresponding Lisp primitives. Every Lisp object, though, has at least
646 one C primitive for creating it.
648 Recall from section (VII) that a Lisp object, as stored in a 32-bit
649 or 64-bit word, has a mark bit, a few tag bits, and a "value" that
650 occupies the remainder of the bits. We can separate the different Lisp
651 object types into four broad categories:
653 * (a) Those for whom the value directly represents the contents of
654 the Lisp object. Only two types are in this category: integers and
655 characters. No special allocation or garbage collection is
656 necessary for such objects. Lisp objects of these types do not
657 need to be `GCPRO'ed.
659 In the remaining three categories, the value is a pointer to a
662 * (b) Those for whom the tag directly specifies the type. Recall
663 that there are only three tag bits; this means that at most five
664 types can be specified this way. The most commonly-used types are
665 stored in this format; this includes conses, strings, vectors, and
666 sometimes symbols. With the exception of vectors, objects in this
667 category are allocated in "frob blocks", i.e. large blocks of
668 memory that are subdivided into individual objects. This saves a
669 lot on malloc overhead, since there are typically quite a lot of
670 these objects around, and the objects are small. (A cons, for
671 example, occupies 8 bytes on 32-bit machines - 4 bytes for each of
672 the two objects it contains.) Vectors are individually
673 `malloc()'ed since they are of variable size. (It would be
674 possible, and desirable, to allocate vectors of certain small
675 sizes out of frob blocks, but it isn't currently done.) Strings
676 are handled specially: Each string is allocated in two parts, a
677 fixed size structure containing a length and a data pointer, and
678 the actual data of the string. The former structure is allocated
679 in frob blocks as usual, and the latter data is stored in "string
680 chars blocks" and is relocated during garbage collection to
683 In the remaining two categories, the type is stored in the object
684 itself. The tag for all such objects is the generic "lrecord"
685 (Lisp_Record) tag. The first four bytes (or eight, for 64-bit machines)
686 of the object's structure are a pointer to a structure that describes
687 the object's type, which includes method pointers and a pointer to a
688 string naming the type. Note that it's possible to save some space by
689 using a one- or two-byte tag, rather than a four- or eight-byte pointer
690 to store the type, but it's not clear it's worth making the change.
692 * (c) Those lrecords that are allocated in frob blocks (see above).
693 This includes the objects that are most common and relatively
694 small, and includes floats, compiled functions, symbols (when not
695 in category (b)), extents, events, and markers. With the cleanup
696 of frob blocks done in 19.12, it's not terribly hard to add more
697 objects to this category, but it's a bit trickier than adding an
698 object type to type (d) (esp. if the object needs a finalization
699 method), and is not likely to save much space unless the object is
700 small and there are many of them. (In fact, if there are very few
701 of them, it might actually waste space.)
703 * (d) Those lrecords that are individually `malloc()'ed. These are
704 called "lcrecords". All other types are in this category. Adding
705 a new type to this category is comparatively easy, and all types
706 added since 19.8 (when the current allocation scheme was devised,
707 by Richard Mlynarik), with the exception of the character type,
708 have been in this category.
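The frob-block idea used for categories (b) and (c) can be sketched in C as follows: objects of one fixed size are carved out of a large `malloc()'ed block, so one `malloc()' call serves many objects. The names (`toy_cons', `toy_frob_block', `toy_alloc_cons', `toy_free_cons') and the block size of 256 are assumptions made for the illustration, and `toy_free_cons' merely stands in for whatever returns dead objects to the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object: two words, like a cons. */
struct toy_cons { void *car, *cdr; };

#define TOY_PER_BLOCK 256   /* assumed block size, chosen arbitrarily */

/* One large block, subdivided into individual objects. */
struct toy_frob_block {
  struct toy_frob_block *prev;
  struct toy_cons slots[TOY_PER_BLOCK];
};

struct toy_cons_allocator {
  struct toy_frob_block *current;  /* block being carved up */
  size_t used;                     /* slots handed out from CURRENT */
  struct toy_cons *free_list;      /* recycled slots, chained via cdr */
  size_t blocks;                   /* number of malloc() calls so far */
};

/* Allocate one cons: reuse a recycled slot if possible, otherwise take
   the next slot of the current block, calling malloc() only when that
   block is exhausted.  Returns NULL if malloc() fails. */
struct toy_cons *toy_alloc_cons (struct toy_cons_allocator *a)
{
  assert (a != NULL);
  if (a->free_list != NULL)
    {
      struct toy_cons *c = a->free_list;
      a->free_list = (struct toy_cons *) c->cdr;
      return c;
    }
  if (a->current == NULL || a->used == TOY_PER_BLOCK)
    {
      struct toy_frob_block *b = malloc (sizeof *b);
      if (b == NULL)
        return NULL;
      b->prev = a->current;
      a->current = b;
      a->used = 0;
      a->blocks++;
    }
  return &a->current->slots[a->used++];
}

/* Return a dead cons to the allocator by pushing it on the free list. */
void toy_free_cons (struct toy_cons_allocator *a, struct toy_cons *c)
{
  assert (a != NULL && c != NULL);
  c->cdr = a->free_list;
  a->free_list = c;
}
```

With these numbers, allocating 300 conses from a zero-initialized allocator costs two `malloc()' calls instead of 300, which is the saving in malloc overhead that the text describes.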
710 Note that bit vectors are a bit of a special case. They are simple
711 lrecords as in category (c), but are individually `malloc()'ed like
712 vectors. You can basically view them as exactly like vectors except
713 that their type is stored in lrecord fashion rather than in
714 directly-tagged fashion.
716 Note that FSF Emacs redesigned their object system in 19.29 to follow
717 a similar scheme. However, given RMS's expressed dislike for data
718 abstraction, the FSF scheme is not nearly as clean or as easy to
719 extend. (FSF calls items of type (c) `Lisp_Misc' and items of type (d)
720 `Lisp_Vectorlike', with separate tags for each, although
721 `Lisp_Vectorlike' is also used for vectors.)
724 File: internals.info, Node: Garbage Collection, Next: GCPROing, Prev: Introduction to Allocation, Up: Allocation of Objects in XEmacs Lisp
Garbage collection is simple in theory but tricky to implement.
Emacs Lisp uses the oldest garbage collection method, called "mark and
sweep". Garbage collection begins by starting with all accessible
locations (i.e. all variables and other slots where Lisp objects might
occur) and recursively traversing all objects accessible from those
slots, marking each one that is found. We then go through all of
memory, free each object that is not marked, and unmark each
object that is marked. Note that "all of memory" means all currently
allocated objects. Traversing all these objects means traversing all
frob blocks, all vectors (which are chained in one big list), and all
lcrecords (which are likewise chained).
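The two phases can be sketched on a toy heap in C. This is a minimal illustration under strong assumptions: the heap is a small fixed array, every object has exactly two reference slots, and the mark lives in a separate field. None of the names below come from the XEmacs sources.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HEAP_SIZE 8
#define TOY_NONE (-1)

/* Hypothetical heap object with two reference slots. */
struct toy_object {
  int allocated;   /* nonzero while the slot holds a live object */
  int marked;      /* set during the mark phase */
  int car, cdr;    /* indices of referenced objects, or TOY_NONE */
};

struct toy_heap { struct toy_object obj[TOY_HEAP_SIZE]; };

/* Mark phase: recursively mark everything reachable from object I.
   Already-marked objects are skipped, so cycles terminate. */
void toy_mark (struct toy_heap *h, int i)
{
  if (i == TOY_NONE)
    return;
  assert (i >= 0 && i < TOY_HEAP_SIZE && h->obj[i].allocated);
  if (h->obj[i].marked)
    return;
  h->obj[i].marked = 1;
  toy_mark (h, h->obj[i].car);
  toy_mark (h, h->obj[i].cdr);
}

/* Sweep phase: visit every allocated object, freeing the unmarked
   ones and unmarking the rest.  Returns the number freed. */
int toy_sweep (struct toy_heap *h)
{
  int i, freed = 0;
  for (i = 0; i < TOY_HEAP_SIZE; i++)
    {
      if (!h->obj[i].allocated)
        continue;
      if (h->obj[i].marked)
        h->obj[i].marked = 0;
      else
        {
          h->obj[i].allocated = 0;
          freed++;
        }
    }
  return freed;
}

/* One full collection: mark from every root, then sweep. */
int toy_collect (struct toy_heap *h, const int *roots, size_t nroots)
{
  size_t r;
  for (r = 0; r < nroots; r++)
    toy_mark (h, roots[r]);
  return toy_sweep (h);
}
```

Note that two unreachable objects that refer only to each other are still freed, since neither is ever reached from a root.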
741 Note that, when an object is marked, the mark has to occur inside of
742 the object's structure, rather than in the 32-bit `Lisp_Object' holding
743 the object's pointer; i.e. you can't just set the pointer's mark bit.
744 This is because there may be many pointers to the same object. This
745 means that the method of marking an object can differ depending on the
746 type. The different marking methods are approximately as follows:
748 1. For conses, the mark bit of the car is set.
750 2. For strings, the mark bit of the string's plist is set.
752 3. For symbols when not lrecords, the mark bit of the symbol's plist
755 4. For vectors, the length is negated after adding 1.
757 5. For lrecords, the pointer to the structure describing the type is
760 6. Integers and characters do not need to be marked, since no
761 allocation occurs for them.
763 The details of this are in the `mark_object()' function.
765 Note that any code that operates during garbage collection has to be
766 especially careful because of the fact that some objects may be marked
767 and as such may not look like they normally do. In particular:
* Some object pointers may have their mark bit set. This will make
`FOOBARP()' predicates fail. Use `GC_FOOBARP()' to deal with this.
772 * Even if you clear the mark bit, `FOOBARP()' will still fail for
773 lrecords because the implementation pointer has been changed (see
774 below). `GC_FOOBARP()' will correctly deal with this.
776 * Vectors have their size field munged, so anything that looks at
777 this field will fail.
779 * Note that `XFOOBAR()' macros *will* work correctly on object
780 pointers with their mark bit set, because the logical shift
781 operations that remove the tag also remove the mark bit.
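The interaction between mark bit, type predicates and value extraction
can be sketched as follows. The bit layout here is an assumption made
for illustration only (one mark bit and three tag bits above a 28-bit
value field in a 32-bit word), and all `toy_' names are hypothetical;
the real macros are defined differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 31 = mark, bits 28..30 = type tag, bits 0..27 = value. */
#define TOY_VALBITS  28u
#define TOY_MARKBIT  (UINT32_C(1) << 31)
#define TOY_TAG_CONS UINT32_C(5)        /* arbitrary hypothetical tag number */

/* Build a tagged word from a tag and a value. */
static uint32_t toy_make(uint32_t tag, uint32_t val)
{
    return (tag << TOY_VALBITS) | (val & ((UINT32_C(1) << TOY_VALBITS) - 1));
}

/* Naive predicate: compares all bits above the value field, mark included,
   so it fails once the mark bit is set. */
static int toy_consp(uint32_t obj)
{
    return (obj >> TOY_VALBITS) == TOY_TAG_CONS;
}

/* GC-safe predicate: ignores the mark bit before comparing the tag. */
static int toy_gc_consp(uint32_t obj)
{
    return ((obj & ~TOY_MARKBIT) >> TOY_VALBITS) == TOY_TAG_CONS;
}

/* Value extraction by shifting: the shift that drops the tag bits
   drops the mark bit as well. */
static uint32_t toy_xval(uint32_t obj)
{
    return (uint32_t)(obj << (32u - TOY_VALBITS)) >> (32u - TOY_VALBITS);
}
```

Under this assumed layout, `toy_xval' returns the same value for a
marked and an unmarked word, while only `toy_gc_consp' recognizes the
marked word as a cons.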
783 Finally, note that garbage collection can be invoked explicitly by
784 calling `garbage-collect' but is also called automatically by `eval',
785 once a certain amount of memory has been allocated since the last
786 garbage collection (according to `gc-cons-threshold').
789 File: internals.info, Node: GCPROing, Next: Garbage Collection - Step by Step, Prev: Garbage Collection, Up: Allocation of Objects in XEmacs Lisp
GCPROing
========
794 `GCPRO'ing is one of the ugliest and trickiest parts of Emacs
795 internals. The basic idea is that whenever garbage collection occurs,
796 all in-use objects must be reachable somehow or other from one of the
797 roots of accessibility. The roots of accessibility are:
799 1. All objects that have been `staticpro()'d. This is used for any
800 global C variables that hold Lisp objects. A call to
801 `staticpro()' happens implicitly as a result of any symbols
802 declared with `defsymbol()' and any variables declared with
803 `DEFVAR_FOO()'. You need to explicitly call `staticpro()' (in the
804 `vars_of_foo()' method of a module) for other global C variables
805 holding Lisp objects. (This typically includes internal lists and
808 Note that `obarray' is one of the `staticpro()'d things.
809 Therefore, all functions and variables get marked through this.
811 2. Any shadowed bindings that are sitting on the `specpdl' stack.
813 3. Any objects sitting in currently active (Lisp) stack frames,
814 catches, and condition cases.
816 4. A couple of special-case places where active objects are located.
818 5. Anything currently marked with `GCPRO'.
Marking with `GCPRO' is necessary because some C functions (quite a
lot, in fact) allocate objects during their operation. Quite
frequently, there will be no other pointer to the object while the
function is running, and if a garbage collection occurs and the object
needs to be referenced again, bad things will happen. The solution is
to mark those objects with `GCPRO'. Unfortunately this is easy to
forget, and there is basically no way around this problem. Here are
some rules:
829 1. For every `GCPRON', there have to be declarations of `struct gcpro
830 gcpro1, gcpro2', etc.
832 2. You *must* `UNGCPRO' anything that's `GCPRO'ed, and you *must not*
833 `UNGCPRO' if you haven't `GCPRO'ed. Getting either of these wrong
834 will lead to crashes, often in completely random places unrelated
835 to where the problem lies.
837 3. The way this actually works is that all currently active `GCPRO's
838 are chained through the `struct gcpro' local variables, with the
839 variable `gcprolist' pointing to the head of the list and the nth
840 local `gcpro' variable pointing to the first `gcpro' variable in
841 the next enclosing stack frame. Each `GCPRO'ed thing is an
842 lvalue, and the `struct gcpro' local variable contains a pointer to
843 this lvalue. This is why things will mess up badly if you don't
844 pair up the `GCPRO's and `UNGCPRO's - you will end up with
845 `gcprolist's containing pointers to `struct gcpro's or local
846 `Lisp_Object' variables in no-longer-active stack frames.
848 4. It is actually possible for a single `struct gcpro' to protect a
849 contiguous array of any number of values, rather than just a
850 single lvalue. To effect this, call `GCPRON' as usual on the
851 first object in the array and then set `gcproN.nvars'.
853 5. *Strings are relocated.* What this means in practice is that the
854 pointer obtained using `XSTRING_DATA()' is liable to change at any
855 time, and you should never keep it around past any function call,
856 or pass it as an argument to any function that might cause a
857 garbage collection. This is why a number of functions accept
858 either a "non-relocatable" `char *' pointer or a relocatable Lisp
859 string, and only access the Lisp string's data at the very last
860 minute. In some cases, you may end up having to `alloca()' some
861 space and copy the string's data into it.
6. By convention, if you have to nest `GCPRO's, use `NGCPRON' (along
with `struct gcpro ngcpro1, ngcpro2', etc.), `NNGCPRON', etc.
This avoids compiler warnings about shadowed locals.
867 7. It is *always* better to err on the side of extra `GCPRO's rather
868 than too few. The extra cycles spent on this are almost never
869 going to make a whit of difference in the speed of anything.
871 8. The general rule to follow is that caller, not callee, `GCPRO's.
872 That is, you should not have to explicitly `GCPRO' any Lisp objects
873 that are passed in as parameters.
One exception to this rule is if you ever plan to change the
parameter value, and store a new object in it. In that case, you
*must* `GCPRO' the parameter, because otherwise the new object
will not be protected.
880 So, if you create any Lisp objects (remember, this happens in all
881 sorts of circumstances, e.g. with `Fcons()', etc.), you are
882 responsible for `GCPRO'ing them, unless you are *absolutely sure*
883 that there's no possibility that a garbage-collection can occur
884 while you need to use the object. Even then, consider `GCPRO'ing.
886 9. A garbage collection can occur whenever anything calls `Feval', or
887 whenever a QUIT can occur where execution can continue past this.
888 (Remember, this is almost anywhere.)
890 10. If you have the *least smidgeon of doubt* about whether you need
891 to `GCPRO', you should `GCPRO'.
893 11. Beware of `GCPRO'ing something that is uninitialized. If you have
894 any shade of doubt about this, initialize all your variables to
12. Be careful of traps, like calling `Fcons()' in the argument to
another function. By the "caller protects" law, you should be
`GCPRO'ing the newly-created cons, but you aren't. A certain
number of functions that are commonly called on freshly created
stuff (e.g. `nconc2()', `Fsignal()') break the "caller protects"
law and go ahead and `GCPRO' their arguments so as to simplify
things, but make sure to check whether it's OK whenever doing
something like this.
906 13. Once again, remember to `GCPRO'! Bugs resulting from insufficient
907 `GCPRO'ing are intermittent and extremely difficult to track down,
908 often showing up in crashes inside of `garbage-collect' or in
909 weirdly corrupted objects or even in incorrect values in a totally
910 different section of code.
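The chaining described in rule 3 and the "caller protects" convention
of rule 8 can be illustrated with a toy model. This is a sketch under
stated assumptions, not the actual XEmacs definitions: the type
`toy_lisp_object', the macros `TOY_GCPRO1' and `TOY_UNGCPRO', and the
function `toy_maybe_gc' (which stands in for any call that could
trigger a collection) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_lisp_object;          /* stand-in for Lisp_Object */

/* One protection record; it lives in the protecting function's frame. */
struct toy_gcpro {
    struct toy_gcpro *next;            /* record of the enclosing protection */
    toy_lisp_object *var;              /* address of the protected lvalue */
    int nvars;                         /* number of contiguous values */
};

static struct toy_gcpro *toy_gcprolist = NULL;   /* head of the chain */

/* Push the local record `gcpro1' onto the chain. */
#define TOY_GCPRO1(v)                                   \
    do {                                                \
        gcpro1.next = toy_gcprolist;                    \
        gcpro1.var = &(v);                              \
        gcpro1.nvars = 1;                               \
        toy_gcprolist = &gcpro1;                        \
    } while (0)

/* Pop it again; must be paired with exactly one TOY_GCPRO1. */
#define TOY_UNGCPRO do { toy_gcprolist = gcpro1.next; } while (0)

/* What a collector would do with the chain: visit every record. */
static int toy_count_protected(void)
{
    int n = 0;
    for (struct toy_gcpro *p = toy_gcprolist; p != NULL; p = p->next)
        n += p->nvars;
    return n;
}

/* Stand-in for "anything that may garbage collect". */
static int toy_seen_by_gc = 0;
static void toy_maybe_gc(void)
{
    toy_seen_by_gc = toy_count_protected();
}

/* Callee: does not protect its parameter, only what it creates. */
static toy_lisp_object toy_inner(toy_lisp_object arg)
{
    toy_lisp_object fresh = arg + 1;   /* pretend this was just allocated */
    struct toy_gcpro gcpro1;
    TOY_GCPRO1(fresh);
    toy_maybe_gc();                    /* fresh and the caller's obj are safe */
    TOY_UNGCPRO;
    return fresh;
}

/* Caller: protects obj before passing it down. */
static toy_lisp_object toy_outer(void)
{
    toy_lisp_object obj = 41;
    toy_lisp_object result;
    struct toy_gcpro gcpro1;
    TOY_GCPRO1(obj);
    result = toy_inner(obj);
    TOY_UNGCPRO;
    return result;
}
```

If `toy_inner' returned without `TOY_UNGCPRO', `toy_gcprolist' would
keep pointing into a stack frame that no longer exists, which is the
failure mode rule 3 warns about.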
Given the extremely error-prone nature of the `GCPRO' scheme, and
the difficulty of tracking down the resulting bugs, it should be
considered a deficiency in the XEmacs code. A solution to this problem
would involve implementing so-called "conservative" garbage collection
for the C stack. That involves looking through all of stack memory and
treating anything that looks like a reference to an object as a
reference. This will result in a few objects not getting collected when
they should, but it obviates the need for `GCPRO'ing, and allows garbage
collection to happen at any point at all, such as during object
allocation.
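The idea of conservative scanning can be sketched with a simulated
stack. This is a toy model under loud assumptions: the "stack" is an
ordinary array of words passed in by the caller, the heap is a small
static array, and all `toy_' names are hypothetical. A real
implementation would scan the actual C stack and registers. The sketch
also assumes a flat address space in which pointers converted to
`uintptr_t' compare in address order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEAP_SIZE 4

struct toy_cell {
    int marked;
    long payload;
};

static struct toy_cell toy_heap[TOY_HEAP_SIZE];

/* If `word' looks like the address of the start of a heap cell,
   return that cell, otherwise NULL. */
static struct toy_cell *toy_cell_from_word(uintptr_t word)
{
    uintptr_t base = (uintptr_t) &toy_heap[0];
    uintptr_t end = (uintptr_t) &toy_heap[TOY_HEAP_SIZE];  /* one past the end */

    if (word < base || word >= end)
        return NULL;                       /* not inside the heap */
    if ((word - base) % sizeof(struct toy_cell) != 0)
        return NULL;                       /* not the start of a cell */
    return &toy_heap[(word - base) / sizeof(struct toy_cell)];
}

/* Treat every word that looks like a reference as a reference.
   Returns the number of cells newly marked. */
static int toy_conservative_mark(const uintptr_t *words, size_t nwords)
{
    int newly_marked = 0;
    for (size_t i = 0; i < nwords; i++) {
        struct toy_cell *cell = toy_cell_from_word(words[i]);
        if (cell != NULL && !cell->marked) {
            cell->marked = 1;
            newly_marked++;
        }
    }
    return newly_marked;
}
```

An integer on the stack that happens to equal a cell address is
indistinguishable from a pointer here, so its cell stays alive. That is
the "few objects not getting collected when they should" mentioned
above.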
923 File: internals.info, Node: Garbage Collection - Step by Step, Next: Integers and Characters, Prev: GCPROing, Up: Allocation of Objects in XEmacs Lisp
925 Garbage Collection - Step by Step
926 =================================
931 * garbage_collect_1::
934 * sweep_lcrecords_1::
935 * compact_string_chars::
937 * sweep_bit_vectors_1::
940 File: internals.info, Node: Invocation, Next: garbage_collect_1, Up: Garbage Collection - Step by Step
Invocation
==========
The first thing that anyone should know about garbage collection is
when and how the garbage collector is invoked. One might think that this
could happen every time new memory is allocated, e.g. new objects are
created, but this is *not* the case. Instead, we have the following
situation:
951 The entry point of any process of garbage collection is an invocation
952 of the function `garbage_collect_1' in file `alloc.c'. The invocation
953 can occur *explicitly* by calling the function `Fgarbage_collect' (in
954 addition this function provides information about the freed memory), or
955 can occur *implicitly* in four different situations:
1. In function `main_1' in file `emacs.c'. This function is called at
each startup of xemacs. The garbage collection is invoked after all
initial creations are completed, but only if a special internal
error-checking constant `ERROR_CHECK_GC' is defined.
961 2. In function `disksave_object_finalization' in file `alloc.c'. The
962 only purpose of this function is to clear the objects from memory
963 which need not be stored with xemacs when we dump out an
964 executable. This is only done by `Fdump_emacs' or by
965 `Fdump_emacs_data' respectively (both in `emacs.c'). The actual
966 clearing is accomplished by making these objects unreachable and
967 starting a garbage collection. The function is only used while
3. In function `Feval / eval' in file `eval.c'. Each time the
well-known and often-used function eval is called to evaluate a form,
one of the first things that could happen is a potential call of
`garbage_collect_1'. There exist three global variables,
`consing_since_gc' (counts the created cons-cells since the last
garbage collection), `gc_cons_threshold' (a specified threshold
after which a garbage collection occurs) and `always_gc'. If
`always_gc' is set or if the threshold is exceeded, the garbage
collection will start.
4. In function `Ffuncall / funcall' in file `eval.c'. This function
evaluates calls of elisp functions and works analogously to `Feval'.
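The threshold test performed on entry to `Feval' and `Ffuncall' can be
sketched as below. The three variables mirror the globals named above,
but everything carries a `toy_' prefix to make clear that this is an
illustrative model and not the code in `eval.c'. The functions
`toy_note_allocation' and `toy_eval_entry' are hypothetical, as is the
assumption that the counter is reset by the collection itself.

```c
#include <assert.h>

static long toy_consing_since_gc = 0;     /* allocation since the last GC */
static long toy_gc_cons_threshold = 1000; /* hypothetical threshold value */
static int toy_always_gc = 0;             /* force a GC on every check */
static int toy_gc_runs = 0;               /* how often the collector ran */

/* Stand-in for the collector: count the run and reset the counter. */
static void toy_garbage_collect_1(void)
{
    toy_gc_runs++;
    toy_consing_since_gc = 0;
}

/* Called by the (imaginary) allocator for each new object. */
static void toy_note_allocation(long amount)
{
    toy_consing_since_gc += amount;
}

/* The check made at the start of evaluating a form or calling a function:
   collect if forced, or if the threshold has been exceeded. */
static void toy_eval_entry(void)
{
    if (toy_always_gc || toy_consing_since_gc > toy_gc_cons_threshold)
        toy_garbage_collect_1();
}
```

Note that allocation by itself never triggers a collection in this
model. Only the next pass through `toy_eval_entry' does, which matches
the point made at the start of this section.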
The upshot is that garbage collection can basically occur everywhere
`Feval' or `Ffuncall' is used, either directly or through
another function. Since calls to these two functions are hidden in
various other functions, many calls to `garbage_collect_1' are not
obviously foreseeable, and therefore unexpected. Instances where they
are used that are worth remembering are various elisp commands, such as
`or', `and', `if', `cond', `while', `setq', etc., miscellaneous
`gui_item_...' functions, everything related to `eval' (`Feval_buffer',
`call0', ...) and inside `Fsignal'. The latter is used to handle
signals, such as the ones raised by every `QUIT' macro
triggered after pressing Ctrl-g.