Foundation instead of in the original English.
\1f
-File: internals.info, Node: garbage_collect_1, Next: mark_object, Prev: Invocation, Up: Garbage Collection - Step by Step
-
-`garbage_collect_1'
--------------------
-
- We can now describe exactly what happens after the invocation takes
-place.
- 1. There are several cases in which the garbage collector is left
- immediately: when we are already garbage collecting
- (`gc_in_progress'), when the garbage collection is somehow
- forbidden (`gc_currently_forbidden'), when we are currently
- displaying something (`in_display') or when we are preparing for
- the armageddon of the whole system (`preparing_for_armageddon').
-
- 2. Next the correct frame in which to put all the output occurring
- during garbage collecting is determined. In order to be able to
- restore the old display's state after displaying the message, some
- data about the current cursor position has to be saved. The
- variables `pre_gc_cursor' and `cursor_changed' take care of that.
-
- 3. The state of `gc_currently_forbidden' must be restored after the
- garbage collection, no matter what happens during the process. We
- accomplish this by `record_unwind_protect'ing the suitable function
- `restore_gc_inhibit' together with the current value of
- `gc_currently_forbidden'.
-
- 4. If we are concurrently running an interactive xemacs session, the
- next step is simply to show the garbage collector's cursor/message.
-
- 5. The following steps are the intrinsic steps of the garbage
- collector, therefore `gc_in_progress' is set.
-
- 6. For debugging purposes, it is possible to copy the current C stack
- frame. However, this seems to be a currently unused feature.
-
- 7. Before actually starting to go over all live objects, references to
- objects that are no longer used are pruned. We only have to do
- this for events (`clear_event_resource') and for specifiers
- (`cleanup_specifiers').
-
- 8. Now the mark phase begins and marks all accessible elements. In
- order to start from all slots that serve as roots of
- accessibility, the function `mark_object' is called for each root
- individually to go out from there to mark all reachable objects.
- All roots that are traversed are shown in their processed order:
- * all constant symbols and static variables that are registered
- via `staticpro' in the array `staticvec'. *Note Adding
- Global Lisp Variables::.
-
- * all Lisp objects that are created in C functions and that
- must be protected from freeing them. They are registered in
- the global list `gcprolist'. *Note GCPROing::.
-
- * all local variables (i.e. their name fields `symbol' and old
- values `old_values') that are bound during the evaluation by
- the Lisp engine. They are stored in `specbinding' structs
- pushed on a stack called `specpdl'. *Note Dynamic Binding;
- The specbinding Stack; Unwind-Protects::.
-
- * all catch blocks that the Lisp engine encounters during the
- evaluation cause the creation of structs `catchtag' inserted
- in the list `catchlist'. Their tag (`tag') and value (`val')
- fields are freshly created objects and therefore have to be
- marked. *Note Catch and Throw::.
-
- * every function application pushes new structs `backtrace' on
- the call stack of the Lisp engine (`backtrace_list'). The
- unique parts that have to be marked are the fields for each
- function (`function') and all their arguments (`args').
- *Note Evaluation::.
-
- * all objects that are used by the redisplay engine that must
- not be freed are marked by a special function called
- `mark_redisplay' (in `redisplay.c').
-
- * all objects created for profiling purposes are allocated by C
- functions instead of using the lisp allocation mechanisms. In
- order to receive the right ones during the sweep phase, they
- also have to be marked manually. That is done by the function
- `mark_profiling_info'
-
- 9. Hash tables in XEmacs belong to a kind of special objects that
- make use of a concept often called 'weak pointers'. To make a
- long story short, these kinds of pointers are not followed during
- the estimation of the live objects during garbage collection. Any
- object referenced only by weak pointers is collected anyway, and
- the reference to it is cleared. In hash tables there are different
- usage patterns of them, manifesting in different types of hash
- tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak'
- (internally also 'key-car-weak' and 'value-car-weak') hash tables,
- each clearing entries depending on different conditions. More
- information can be found in the documentation to the function
- `make-hash-table'.
-
- Because there are complicated dependency rules about when and what
- to mark while processing weak hash tables, the standard `marker'
- method is only active if it is marking non-weak hash tables. As
- soon as a weak component is in the table, the hash table entries
- are ignored while marking. Instead their marking is done each
- separately by the function `finish_marking_weak_hash_tables'. This
- function iterates over each hash table entry `hentries' for each
- weak hash table in `Vall_weak_hash_tables'. Depending on the type
- of a table, the appropriate action is performed. If a table is
- acting as `HASH_TABLE_KEY_WEAK', and a key already marked,
- everything reachable from the `value' component is marked. If it is
- acting as a `HASH_TABLE_VALUE_WEAK' and the value component is
- already marked, the marking starts beginning only from the `key'
- component. If it is a `HASH_TABLE_KEY_CAR_WEAK' and the car of
- the key entry is already marked, we mark both the `key' and
- `value' components. Finally, if the table is of the type
- `HASH_TABLE_VALUE_CAR_WEAK' and the car of the value components is
- already marked, again both the `key' and the `value' components
- get marked.
-
- Again, there are lists with comparable properties called weak
- lists. There exist different peculiarities of their types called
- `simple', `assoc', `key-assoc' and `value-assoc'. You can find
- further details about them in the description to the function
- `make-weak-list'. The scheme of their marking is similar: all weak
- lists are listed in `Vall_weak_lists', therefore we iterate over
- them. The marking is advanced until we hit an already marked pair.
- Then we know that during a former run all the rest has been marked
- completely. Again, depending on the special type of the weak list,
- our jobs differ. If it is a `WEAK_LIST_SIMPLE' and the elem is
- marked, we mark the `cons' part. If it is a `WEAK_LIST_ASSOC' and
- not a pair or a pair with both marked car and cdr, we mark the
- `cons' and the `elem'. If it is a `WEAK_LIST_KEY_ASSOC' and not a
- pair or a pair with a marked car of the elem, we mark the `cons'
- and the `elem'. Finally, if it is a `WEAK_LIST_VALUE_ASSOC' and
- not a pair or a pair with a marked cdr of the elem, we mark both
- the `cons' and the `elem'.
-
- Since, by marking objects in reach from weak hash tables and weak
- lists, other objects could get marked, this perhaps implies
- further marking of other weak objects, both finishing functions
- are redone as long as yet unmarked objects get freshly marked.
-
- 10. After completing the special marking for the weak hash tables and
- for the weak lists, all entries that point to objects that are
- going to be swept in the further process are useless, and
- therefore have to be removed from the table or the list.
-
- The function `prune_weak_hash_tables' does the job for weak hash
- tables. Totally unmarked hash tables are removed from the list
- `Vall_weak_hash_tables'. The other ones are treated more carefully
- by scanning over all entries and removing one as soon as one of
- the components `key' and `value' is unmarked.
-
- The same idea applies to the weak lists. It is accomplished by
- `prune_weak_lists': An unmarked list is pruned from
- `Vall_weak_lists' immediately. A marked list is treated more
- carefully by going over it and removing just the unmarked pairs.
-
- 11. The function `prune_specifiers' checks all listed specifiers held
- in `Vall_specifiers' and removes from the list the ones that are
- unmarked.
-
- 12. All syntax tables are stored in a list called
- `Vall_syntax_tables'. The function `prune_syntax_tables' walks
- through it and unlinks the tables that are unmarked.
-
- 13. Next, we will attack the complete sweeping - the function
- `gc_sweep' which holds the predominance.
-
- 14. First, all the variables with respect to garbage collection are
- reset. `consing_since_gc' - the counter of the created cells since
- the last garbage collection - is set back to 0, and
- `gc_in_progress' is not `true' anymore.
-
- 15. In case the session is interactive, the displayed cursor and
- message are removed again.
-
- 16. The state of `gc_inhibit' is restored to the former value by
- unwinding the stack.
-
- 17. A small memory reserve is always held back that can be reached by
- `breathing_space'. If nothing more is left, we create a new reserve
- and exit.
-
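The repeated finishing pass described in step 9 can be pictured with a toy model. This is only a sketch under stated assumptions: `toy_entry', `toy_finish_marking_key_weak' and `toy_mark_weak_until_stable' are hypothetical names, an "object" is reduced to a bare mark flag, and only the key-weak rule is modelled. It is not the code in the XEmacs sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): an object is just a mark flag, and a
   key-weak table entry refers to the flags of its key and value. */
struct toy_entry
{
  bool *key_marked;
  bool *value_marked;
};

/* One finishing pass over a key-weak table: when the key is already
   marked, mark the value.  Returns true if anything was newly marked. */
static bool
toy_finish_marking_key_weak (struct toy_entry *entries, size_t n)
{
  bool did_mark = false;
  for (size_t i = 0; i < n; i++)
    if (*entries[i].key_marked && !*entries[i].value_marked)
      {
        *entries[i].value_marked = true;
        did_mark = true;
      }
  return did_mark;
}

/* Redo the pass as long as it marks something new.  Returns the number
   of passes, counting the final pass that found nothing to do. */
static int
toy_mark_weak_until_stable (struct toy_entry *entries, size_t n)
{
  int passes = 0;
  bool did_mark;
  do
    {
      passes++;
      did_mark = toy_finish_marking_key_weak (entries, n);
    }
  while (did_mark);
  return passes;
}
```

A single pass is not enough in general, because marking a value may turn it into a marked key for an entry that was already visited; that is why the loop runs until a pass changes nothing.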
-\1f
-File: internals.info, Node: mark_object, Next: gc_sweep, Prev: garbage_collect_1, Up: Garbage Collection - Step by Step
-
-`mark_object'
--------------
-
- The first thing that is checked while marking an object is whether
-the object is a real Lisp object `Lisp_Type_Record' or just an integer
-or a character. Integers and characters are the only two types that are
-stored directly - without another level of indirection, and therefore
-they don't have to be marked and collected. *Note How Lisp Objects Are
-Represented in C::.
-
- The second case is the one we have to handle. It is the one when we
-are dealing with a pointer to a Lisp object. But, there exist also three
-possibilities, that prevent us from doing anything while marking: The
-object is read only which prevents it from being garbage collected,
-i.e. marked (`C_READONLY_RECORD_HEADER'). The object in question is
-already marked, and need not be marked for the second time (checked by
-`MARKED_RECORD_HEADER_P'). If it is a special, unmarkable object
-(`UNMARKABLE_RECORD_HEADER_P', apparently, these are objects that sit
-in some CONST space, and can therefore not be marked, see
-`this_one_is_unmarkable' in `alloc.c').
-
- Now, the actual marking is feasible. We do so by once using the macro
-`MARK_RECORD_HEADER' to mark the object itself (actually the special
-flag in the lrecord header), and calling its special marker "method"
-`marker' if available. The marker method marks every other object that
-is in reach from our current object. Note, that these marker methods
-should not call `mark_object' recursively, but instead should return
-the next object from where further marking has to be performed.
-
- In case another object was returned, as mentioned before, we
-reiterate the whole `mark_object' process beginning with this next
-object.
-
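The "return the next object instead of recursing" convention can be sketched as a loop. The following is a simplified illustration, not the real `mark_object': `toy_object', `toy_mark_next' and `toy_mark_object' are hypothetical names, the read-only and mark flags are plain booleans rather than header bits, and each object has a single outgoing reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_object;

/* A marker method returns the next object to continue marking from,
   or NULL when there is nothing further to mark. */
typedef struct toy_object *(*toy_marker_fn) (struct toy_object *);

struct toy_object
{
  bool marked;
  bool read_only;
  toy_marker_fn marker;       /* may be NULL: no marker method */
  struct toy_object *next;    /* the only reference this toy type holds */
};

/* Marker for the toy type: hand back the referenced object. */
static struct toy_object *
toy_mark_next (struct toy_object *obj)
{
  return obj->next;
}

/* Iterative marking: the loop, not recursion, follows the chain, so
   the C stack does not grow with the length of the chain.  Returns
   the number of objects newly marked by this call. */
static int
toy_mark_object (struct toy_object *obj)
{
  int newly_marked = 0;
  while (obj != NULL)
    {
      if (obj->read_only || obj->marked)
        break;                /* nothing to do for this object */
      obj->marked = true;
      newly_marked++;
      obj = obj->marker ? obj->marker (obj) : NULL;
    }
  return newly_marked;
}
```

Because an already marked object stops the loop, cyclic structures terminate without any extra bookkeeping.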
-\1f
File: internals.info, Node: gc_sweep, Next: sweep_lcrecords_1, Prev: mark_object, Up: Garbage Collection - Step by Step
`gc_sweep'
Our next candidates are the other objects that behave quite
differently than everything else: the strings. They consist of two
-parts, a fixed-size portion (`struct Lisp_string') holding the string's
+parts, a fixed-size portion (`struct Lisp_String') holding the string's
length, its property list and a pointer to the second part, and the
actual string data, which is stored in string-chars blocks comparable to
frob blocks. In this block, the data is not only freed, but also a
[see `lrecord.h']
All lrecords have at the beginning of their structure a `struct
-lrecord_header'. This just contains a pointer to a `struct
+lrecord_header'. This just contains a type number and some flags,
+including the mark bit. All builtin type numbers are defined as
+constants in `enum lrecord_type', to allow the compiler to generate
+more efficient code for `TYPEP'. The type number, through the
+`lrecord_implementations_table', gives access to a `struct
lrecord_implementation', which is a structure containing method pointers
and such. There is one of these for each type, and it is a global,
constant, statically-declared structure that is declared in the
-`DEFINE_LRECORD_IMPLEMENTATION()' macro. (This macro actually declares
-an array of two `struct lrecord_implementation' structures. The first
-one contains all the standard method pointers, and is used in all
-normal circumstances. During garbage collection, however, the lrecord
-is "marked" by bumping its implementation pointer by one, so that it
-points to the second structure in the array. This structure contains a
-special indication in it that it's a "marked-object" structure: the
-finalize method is the special function `this_marks_a_marked_record()',
-and all other methods are null pointers. At the end of garbage
-collection, all lrecords will either be reclaimed or unmarked by
-decrementing their implementation pointers, so this second structure
-pointer will never remain past garbage collection.
-
- Simple lrecords (of type (c) above) just have a `struct
+`DEFINE_LRECORD_IMPLEMENTATION()' macro.
+
+ Simple lrecords (of type (b) above) just have a `struct
lrecord_header' at their beginning. lcrecords, however, actually have a
`struct lcrecord_header'. This, in turn, has a `struct lrecord_header'
at its beginning, so sanity is preserved; but it also has a pointer
Whenever you create an lrecord, you need to call either
`DEFINE_LRECORD_IMPLEMENTATION()' or
`DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()'. This needs to be specified
-in a C file, at the top level. What this actually does is define and
-initialize the implementation structure for the lrecord. (And possibly
-declares a function `error_check_foo()' that implements the `XFOO()'
-macro when error-checking is enabled.) The arguments to the macros are
-the actual type name (this is used to construct the C variable name of
-the lrecord implementation structure and related structures using the
-`##' macro concatenation operator), a string that names the type on the
-Lisp level (this may not be the same as the C type name; typically, the
-C type name has underscores, while the Lisp string has dashes), various
-method pointers, and the name of the C structure that contains the
-object. The methods are used to encapsulate type-specific information
-about the object, such as how to print it or mark it for garbage
-collection, so that it's easy to add new object types without having to
-add a specific case for each new type in a bunch of different places.
+in a `.c' file, at the top level. What this actually does is define
+and initialize the implementation structure for the lrecord. (And
+possibly declares a function `error_check_foo()' that implements the
+`XFOO()' macro when error-checking is enabled.) The arguments to the
+macros are the actual type name (this is used to construct the C
+variable name of the lrecord implementation structure and related
+structures using the `##' macro concatenation operator), a string that
+names the type on the Lisp level (this may not be the same as the C
+type name; typically, the C type name has underscores, while the Lisp
+string has dashes), various method pointers, and the name of the C
+structure that contains the object. The methods are used to
+encapsulate type-specific information about the object, such as how to
+print it or mark it for garbage collection, so that it's easy to add
+new object types without having to add a specific case for each new
+type in a bunch of different places.
The difference between `DEFINE_LRECORD_IMPLEMENTATION()' and
`DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()' is that the former is used
For the purpose of keeping allocation statistics, the allocation
engine keeps a list of all the different types that exist. Note that,
since `DEFINE_LRECORD_IMPLEMENTATION()' is a macro that is specified at
-top-level, there is no way for it to add to the list of all existing
-types. What happens instead is that each implementation structure
-contains in it a dynamically assigned number that is particular to that
-type. (Or rather, it contains a pointer to another structure that
-contains this number. This evasiveness is done so that the
-implementation structure can be declared const.) In the sweep stage of
-garbage collection, each lrecord is examined to see if its
-implementation structure has its dynamically-assigned number set. If
-not, it must be a new type, and it is added to the list of known types
-and a new number assigned. The number is used to index into an array
-holding the number of objects of each type and the total memory
-allocated for objects of that type. The statistics in this array are
-also computed during the sweep stage. These statistics are returned by
-the call to `garbage-collect' and are printed out at the end of the
-loadup phase.
+top-level, there is no way for it to initialize the global data
+structures containing type information, like
+`lrecord_implementations_table'. For this reason a call to
+`INIT_LRECORD_IMPLEMENTATION' must be added to the same source file
+containing `DEFINE_LRECORD_IMPLEMENTATION', not at the top level but
+inside one of the init functions, typically `syms_of_FOO()'.
+`INIT_LRECORD_IMPLEMENTATION' must be called before an object of this
+type is used.
+
+ The type number is also used to index into an array holding the
+number of objects of each type and the total memory allocated for
+objects of that type. The statistics in this array are computed during
+the sweep stage. These statistics are returned by the call to
+`garbage-collect'.
Note that for every type defined with a `DEFINE_LRECORD_*()' macro,
there needs to be a `DECLARE_LRECORD_IMPLEMENTATION()' somewhere in a
configurations and opaques.
\1f
-File: internals.info, Node: Low-level allocation, Next: Pure Space, Prev: lrecords, Up: Allocation of Objects in XEmacs Lisp
+File: internals.info, Node: Low-level allocation, Next: Cons, Prev: lrecords, Up: Allocation of Objects in XEmacs Lisp
Low-level allocation
====================
the memory warnings are not functional.)
Allocated memory that is going to be used to make a Lisp object is
-created using `allocate_lisp_storage()'. This calls `xmalloc()' but
-also verifies that the pointer to the memory can fit into a Lisp word
-(remember that some bits are taken away for a type tag and a mark bit).
-If not, an error is issued through `memory_full()'.
+created using `allocate_lisp_storage()'. This just calls `xmalloc()'.
+It used to verify that the pointer to the memory can fit into a Lisp
+word, before the current Lisp object representation was introduced.
`allocate_lisp_storage()' is called by `alloc_lcrecord()',
`ALLOCATE_FIXED_TYPE()', and the vector and bit-vector creation
routines. These routines also call `INCREMENT_CONS_COUNTER()' at the
is reached.
\1f
-File: internals.info, Node: Pure Space, Next: Cons, Prev: Low-level allocation, Up: Allocation of Objects in XEmacs Lisp
-
-Pure Space
-==========
-
- Not yet documented.
-
-\1f
-File: internals.info, Node: Cons, Next: Vector, Prev: Pure Space, Up: Allocation of Objects in XEmacs Lisp
+File: internals.info, Node: Cons, Next: Vector, Prev: Low-level allocation, Up: Allocation of Objects in XEmacs Lisp
Cons
====
Symbol
======
- Symbols are also allocated in frob blocks. Note that the code
-exists for symbols to be either lrecords (category (c) above) or simple
-types (category (b) above), and are lrecords by default (I think),
-although there is no good reason for this.
-
- Note that symbols in the awful horrible obarray structure are
-chained through their `next' field.
+ Symbols are also allocated in frob blocks. Symbols in the awful
+horrible obarray structure are chained through their `next' field.
Remember that `intern' looks up a symbol in an obarray, creating one
if necessary.
Not yet documented.
\1f
-File: internals.info, Node: Events and the Event Loop, Next: Evaluation; Stack Frames; Bindings, Prev: Allocation of Objects in XEmacs Lisp, Up: Top
+File: internals.info, Node: Dumping, Next: Events and the Event Loop, Prev: Allocation of Objects in XEmacs Lisp, Up: Top
+
+Dumping
+*******
+
+What is dumping and its justification
+=====================================
+
+ The C code of XEmacs is just a Lisp engine with a lot of built-in
+primitives useful for writing an editor. The editor itself is written
+mostly in Lisp, and represents around 100K lines of code. Loading and
+executing the initialization of all this code takes a bit of time (five
+to ten times the usual startup time of current xemacs) and requires
+having all the lisp source files around. Having to reload them each
+time the editor is started would not be acceptable.
+
+ The traditional solution to this problem is called dumping: the build
+process first creates the lisp engine under the name `temacs', then
+runs it until it has finished loading and initializing all the lisp
+code, and eventually creates a new executable called `xemacs' including
+both the object code in `temacs' and all the contents of the memory
+after the initialization.
+
+ This solution works, but it has a huge problem: creating the new
+executable from the actual contents of memory is an extremely
+system-specific and error-prone process that interferes with a lot of
+system libraries (like malloc). It is getting even worse nowadays:
+libraries increasingly use constructors that are called automatically
+when the program starts (even before main()), and these tend to crash
+when they are called twice, once before dumping and once after (the
+IRIX 6.x libz.so pulls in, through its dependencies, some C++ image
+libraries that have this problem). Writing the dumper is also one of
+the most difficult parts of porting XEmacs to a new operating system.
+Basically, `dumping' is an operation that is just not officially
+supported on many operating systems.
+
+ The aim of the portable dumper is to solve the same problem as the
+system-specific dumper, that is to be able to reload quickly, using only
+a small number of files, the fully initialized lisp part of the editor,
+without any system-specific hacks.
+
+* Menu:
+
+* Overview::
+* Data descriptions::
+* Dumping phase::
+* Reloading phase::
+* Remaining issues::
+
+\1f
+File: internals.info, Node: Overview, Next: Data descriptions, Prev: Dumping, Up: Dumping
+
+Overview
+========
+
+ The portable dumping system has to:
+
+ 1. At dump time, write all initialized, non-quickly-rebuildable data
+ to a file [Note: currently named `xemacs.dmp', but the name will
+ change], along with all the information needed for reloading.
+
+ 2. When starting xemacs, reload the dump file, relocate it to its new
+ starting address if needed, and reinitialize all pointers to this
+ data. Also, rebuild all the quickly rebuildable data.
+
+\1f
+File: internals.info, Node: Data descriptions, Next: Dumping phase, Prev: Overview, Up: Dumping
+
+Data descriptions
+=================
+
+ The most complex task of the dumper is to be able to write lisp
+objects (lrecords) and C structs to disk and reload them at a different
+address, updating all the pointers they include in the process. This
+is done by using external data descriptions that give information about
+the layout of the structures in memory.
+
+ The specification of these descriptions is in lrecord.h. A
+description of an lrecord is an array of struct lrecord_description.
+Each of these structs includes a type, an offset in the structure and
+some optional parameters depending on the type. For instance, here is
+the string description:
+
+ static const struct lrecord_description string_description[] = {
+ { XD_BYTECOUNT, offsetof (Lisp_String, size) },
+ { XD_OPAQUE_DATA_PTR, offsetof (Lisp_String, data), XD_INDIRECT(0, 1) },
+ { XD_LISP_OBJECT, offsetof (Lisp_String, plist) },
+ { XD_END }
+ };
+
+ The first line indicates a member of type Bytecount, which is used by
+the next, indirect directive. The second means "there is a pointer to
+some opaque data in the field `data'". The length of said data is
+given by the expression `XD_INDIRECT(0, 1)', which means "the value in
+the 0th line of the description (welcome to C) plus one". The third
+line means "there is a Lisp_Object member `plist' in the Lisp_String
+structure". `XD_END' then ends the description.
+
+ This gives us all the information we need to move around what is
+pointed to by a structure (C or lrecord) and, by transitivity,
+everything that it points to. The only missing information for dumping
+is the size of the structure. For lrecords, this is part of the
+lrecord_implementation, so we don't need to duplicate it. For C
+structures we use a struct struct_description, which includes a size
+field and a pointer to an associated array of lrecord_description.
+
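To make the indirect length rule concrete, here is a small self-contained sketch of how a description array might be interpreted. Everything in it is an assumption for illustration: the `toy_' types, the enum values and the way the indirection is stored as a (line, delta) pair are hypothetical and do not reproduce the definitions in lrecord.h.

```c
#include <assert.h>
#include <stddef.h>

enum toy_xd_type { TOY_XD_END, TOY_XD_BYTECOUNT, TOY_XD_OPAQUE_DATA_PTR };

/* Hypothetical description line: a type, an offset into the structure,
   and, for opaque data, the description line holding the count plus a
   constant to add to it. */
struct toy_description
{
  enum toy_xd_type type;
  size_t offset;
  int indirect_line;
  int indirect_delta;
};

struct toy_string
{
  long size;    /* number of bytes, without the trailing NUL */
  char *data;
};

static const struct toy_description toy_string_description[] = {
  { TOY_XD_BYTECOUNT, offsetof (struct toy_string, size), 0, 0 },
  /* Length is "value of line 0, plus one" (room for the NUL). */
  { TOY_XD_OPAQUE_DATA_PTR, offsetof (struct toy_string, data), 0, 1 },
  { TOY_XD_END, 0, 0, 0 }
};

/* Return the size in bytes of the first opaque block described for
   OBJ, or -1 if the description contains no opaque data pointer. */
static long
toy_opaque_data_size (const void *obj, const struct toy_description *desc)
{
  for (int i = 0; desc[i].type != TOY_XD_END; i++)
    if (desc[i].type == TOY_XD_OPAQUE_DATA_PTR)
      {
        const struct toy_description *src = &desc[desc[i].indirect_line];
        assert (src->type == TOY_XD_BYTECOUNT);
        /* Read the count member through its recorded offset. */
        long count = *(const long *) ((const char *) obj + src->offset);
        return count + desc[i].indirect_delta;
      }
  return -1;
}
```

The point of the design is that the walker never needs to know the C type of the structure: offsets and line references in the description are enough to find both the pointer and the amount of data behind it.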
+\1f
+File: internals.info, Node: Dumping phase, Next: Reloading phase, Prev: Data descriptions, Up: Dumping
+
+Dumping phase
+=============
+
+ Dumping is done by calling the function pdump() (in dumper.c) which
+is invoked from Fdump_emacs (in emacs.c). This function performs a
+number of tasks.
+
+* Menu:
+
+* Object inventory::
+* Address allocation::
+* The header::
+* Data dumping::
+* Pointers dumping::
+
+\1f
+File: internals.info, Node: Object inventory, Next: Address allocation, Prev: Dumping phase, Up: Dumping phase
+
+Object inventory
+----------------
+
+ The first task is to build the list of the objects to dump. This
+includes:
+
+ * lisp objects
+
+ * C structures
+
+ We end up with one `pdump_entry_list_elmt' per object group (arrays
+of C structs are kept together) which includes a pointer to the first
+object of the group, the per-object size and the count of objects in the
+group, along with some other information which is initialized later.
+
+ These entries are linked together in `pdump_entry_list' structures
+and can be enumerated through one of:
+
+ 1. the `pdump_object_table', an array of `pdump_entry_list', one per
+ lrecord type, indexed by type number.
+
+ 2. the `pdump_opaque_data_list', used for the opaque data which does
+ not include pointers, and hence does not need descriptions.
+
+ 3. the `pdump_struct_table', which is a vector of
+ `struct_description'/`pdump_entry_list' pairs, used for non-opaque
+ C structures.
+
+ This uses a marking strategy similar to the garbage collector. Some
+differences though:
+
+ 1. We do not use the mark bit (which does not exist for C structures
+ anyway), we use a big hash table instead.
+
+ 2. We do not use the mark function of lrecords but instead rely on the
+ external descriptions. This happens essentially because we need to
+ follow pointers to C structures and opaque data in addition to
+ Lisp_Object members.
+
+ This is done by `pdump_register_object', which handles Lisp_Object
+variables, and `pdump_register_struct', which handles C structures;
+both delegate the description management to `pdump_register_sub'.
+
+ The hash table doubles as a map from objects to pdump_entry_list_elmt
+(i.e. it allows us to look up a pdump_entry_list_elmt with the object it
+points to). Entries are added with `pdump_add_entry()' and looked up with
+`pdump_get_entry()'. There is no need for entry removal. The hash
+value is computed quite basically from the object pointer by
+`pdump_make_hash()'.
+
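An add-only, pointer-keyed table of this kind is easy to sketch. The version below is a guess at the general shape, not the code in dumper.c: the `toy_' names, the table size, the shift-and-modulo hash and the linear probing are all assumptions, and the table is assumed never to fill up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_HASH_SIZE = 1024 };   /* assumed size; must exceed entry count */

/* Hypothetical stand-in for pdump_entry_list_elmt. */
struct toy_entry_elmt
{
  const void *obj;
  size_t size;
};

static struct toy_entry_elmt *toy_hash[TOY_HASH_SIZE];

/* A basic hash of the object address: drop the low bits, which are
   mostly zero for aligned objects, then reduce modulo the table size. */
static size_t
toy_make_hash (const void *obj)
{
  return (size_t) (((uintptr_t) obj >> 3) % TOY_HASH_SIZE);
}

/* Look up the entry describing OBJ, or NULL if OBJ is not registered. */
static struct toy_entry_elmt *
toy_get_entry (const void *obj)
{
  size_t pos = toy_make_hash (obj);
  while (toy_hash[pos] != NULL)
    {
      if (toy_hash[pos]->obj == obj)
        return toy_hash[pos];
      pos = (pos + 1) % TOY_HASH_SIZE;   /* linear probing */
    }
  return NULL;
}

/* Register an entry.  There is no removal, so probing stays valid. */
static void
toy_add_entry (struct toy_entry_elmt *e)
{
  size_t pos = toy_make_hash (e->obj);
  while (toy_hash[pos] != NULL)
    pos = (pos + 1) % TOY_HASH_SIZE;
  toy_hash[pos] = e;
}
```

A failed lookup plays the role of "not yet marked", and a successful one both stops the traversal and gives access to the bookkeeping for that object.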
+ The roots for the marking are:
+
+ 1. the `staticpro''ed variables (there is a special
+ `staticpro_nodump()' call for protected variables we do not want
+ to dump).
+
+ 2. the `pdump_wire''d variables (`staticpro' is equivalent to
+ `staticpro_nodump()' + `pdump_wire()').
+
+ 3. the `dumpstruct''ed variables, which point to C structures.
+
+ This does not include the GCPRO'ed variables, the specbinds, the
+catchtags, the backtrace list, the redisplay or the profiling info, since
+we do not want to rebuild the actual chain of lisp calls that led up to
+the dump-emacs call, only the global variables.
+
+ Weak lists and weak hash tables are dumped as if they were their
+non-weak equivalent (without changing their type, of course). This has
+not yet been a problem.
+
+\1f
+File: internals.info, Node: Address allocation, Next: The header, Prev: Object inventory, Up: Dumping phase
+
+Address allocation
+------------------
+
+ The next step is to allocate the offsets of each of the objects in
+the final dump file. This is done by `pdump_allocate_offset()' which
+is called indirectly by `pdump_scan_by_alignment()'.
+
+ The strategy to deal with alignment problems uses these facts:
+
+ 1. real world alignment requirements are powers of two.
+
+ 2. the C compiler is required to adjust the size of a struct so that
+ you can have an array of them next to each other. This means you
+ can get an upper bound on the alignment requirement of a given
+ structure by finding the largest power of two that divides its size.
+
+ 3. the non-variant part of variable size lrecords has an alignment
+ requirement of 4.
+
+ Hence, for each lrecord type, C struct type or opaque data block the
+alignment requirement is computed as a power of two, with a minimum of
+2^2 for lrecords. `pdump_scan_by_alignment()' then scans all the
+`pdump_entry_list_elmt''s, the ones with the highest requirements
+first. This ensures the best packing.
+
+ The maximum alignment requirement we take into account is 2^8.
+
+ `pdump_allocate_offset()' only has to do a linear allocation,
+starting at offset 256 (this leaves room for the header and keeps the
+alignments happy).
+
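The alignment rule above can be written down in a few lines. This is a sketch of the arithmetic only, under the limits stated in the text (minimum 2^2 for lrecords, maximum 2^8); `toy_alignment_shift' and `toy_align_up' are hypothetical helpers, not functions from dumper.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_ALIGN_SHIFT = 8, TOY_LRECORD_MIN_ALIGN_SHIFT = 2 };

/* Return k such that 2^k is the alignment assumed for an object of
   SIZE bytes: the largest power of two dividing SIZE, capped at 2^8,
   and raised to at least 2^2 for lrecords. */
static int
toy_alignment_shift (size_t size, bool is_lrecord)
{
  assert (size > 0);
  int shift = 0;
  while (shift < TOY_MAX_ALIGN_SHIFT
         && (size & ((size_t) 1 << shift)) == 0)
    shift++;
  if (is_lrecord && shift < TOY_LRECORD_MIN_ALIGN_SHIFT)
    shift = TOY_LRECORD_MIN_ALIGN_SHIFT;
  return shift;
}

/* Round OFFSET up to the next multiple of 2^SHIFT. */
static size_t
toy_align_up (size_t offset, int shift)
{
  size_t mask = ((size_t) 1 << shift) - 1;
  return (offset + mask) & ~mask;
}
```

For example, a 24-byte struct gets an assumed alignment of 8, since 8 is the largest power of two dividing 24. Handing out offsets for the most strictly aligned groups first means later, less demanding groups rarely need padding.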
+\1f
+File: internals.info, Node: The header, Next: Data dumping, Prev: Address allocation, Up: Dumping phase
+
+The header
+----------
+
+ The next step creates the file and writes a header with a signature
+and some miscellaneous information in it (number of staticpro, number of
+assigned lrecord types, etc.). The reloc_address field, which
+indicates at which address the file should be loaded if we want to
+avoid post-reload relocation, is set to 0. It then seeks to offset 256
+(base offset for the objects).
+
+\1f
+File: internals.info, Node: Data dumping, Next: Pointers dumping, Prev: The header, Up: Dumping phase
+
+Data dumping
+------------
+
+ The data is dumped in the same order as the addresses were allocated
+by `pdump_dump_data()', called from `pdump_scan_by_alignment()'. This
+function copies the data to a temporary buffer, relocates all pointers
+in the object to the addresses allocated in the Address allocation step,
+and writes it to the file. Using the same order means that, if we are
+careful with lrecords whose size is not a multiple of 4, we are assured
+that the object is always written at the offset in the file allocated
+in the Address allocation step.
+
+\1f
+File: internals.info, Node: Pointers dumping, Prev: Data dumping, Up: Dumping phase
+
+Pointers dumping
+----------------
+
+ A bunch of tables needed to properly reassign the global pointers are
+then written. They are:
+
+ 1. the staticpro array
+
+ 2. the dumpstruct array
+
+ 3. the lrecord_implementations_table array
+
+ 4. a vector of all the offsets to the objects in the file that
+ include a description (for faster relocation at reload time)
+
+ 5. the pdump_wired and pdump_wired_list arrays
+
+ For each of the arrays we write both the pointer to the variables and
+the relocated offset of the object they point to. Since these variables
+are global, the pointers are still valid when restarting the program and
+are used to regenerate the global pointers.
+
+ The `pdump_wired_list' array is a special case. The variables it
+points to are the heads of weak linked lists of lisp objects of the same
+type. Not all objects of this list are dumped so the relocated pointer
+we associate with them points to the first dumped object of the list, or
+Qnil if none is available. This is also the reason why they are not
+used as roots for the purpose of object enumeration.
+
+ This is the end of the dumping part.
+
+\1f
+File: internals.info, Node: Reloading phase, Next: Remaining issues, Prev: Dumping phase, Up: Dumping
+
+Reloading phase
+===============
+
+File loading
+------------
+
+ The file is mmap'ed in memory (which ensures a PAGESIZE alignment, at
+least 4096), or, if mmap is unavailable or fails, a 256-byte-aligned
+block is malloc'ed and the file is loaded into it.
+
+ Some variables are reinitialized from the values found in the header.
+
+ The difference between the actual loading address and the
+reloc_address is computed and will be used for all the relocations.
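
As a rough illustration of how one difference can serve every relocation, the sketch below computes the delta once and applies it to a stored pointer. The helper names `sketch_reloc_delta` and `sketch_reloc_pointer` are assumptions made for this example and are not part of the pdump code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: difference between the address where the file
   actually landed and the address assumed when it was dumped. */
static ptrdiff_t
sketch_reloc_delta (const void *actual_address, uintptr_t reloc_address)
{
  return (ptrdiff_t) ((uintptr_t) actual_address - reloc_address);
}

/* Hypothetical helper: adjust one pointer stored inside a loaded object.
   NULL pointers are left alone. */
static void
sketch_reloc_pointer (void **slot, ptrdiff_t delta)
{
  assert (slot != NULL);
  if (*slot != NULL)
    *slot = (void *) ((uintptr_t) *slot + (uintptr_t) delta);
}
```

When the delta is zero every pointer in the file is already correct, so no adjustment is needed.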
+
+Putting back the staticvec
+--------------------------
+
+ The staticvec array is memcpy'd from the file and the variables it
+points to are reset to the relocated object addresses.
+
+Putting back the dumpstructed variables
+---------------------------------------
+
+ The variables pointed to by dumpstruct in the dump phase are reset to
+the right relocated object addresses.
+
+lrecord_implementations_table
+-----------------------------
+
+ The lrecord_implementations_table is reset to its dump time state and
+the right lrecord_type_index values are put in.
+
+Object relocation
+-----------------
+
+ All the objects are relocated using their description and their
+offset by `pdump_reloc_one'. This step is unnecessary if the
+reloc_address is equal to the file loading address.
+
+Putting back the pdump_wire and pdump_wire_list variables
+---------------------------------------------------------
+
+ This works the same way as putting back the dumpstructed variables.
+
+Reorganize the hash tables
+--------------------------
+
+ Since some of the hash values in the Lisp hash tables are
+address-dependent, their layout is now wrong. So we go through each of
+them and have it reorganized by calling `pdump_reorganize_hash_table'.
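
To see why an address-dependent hash goes stale, consider the toy open-addressing table below, whose hash is derived from the key's address. It is a sketch under assumptions: `struct sketch_table` and the `sketch_*` functions are hypothetical and do not reflect how `pdump_reorganize_hash_table' or the Lisp hash tables are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SLOTS 8

/* Toy table of pointer keys; NULL marks an empty slot. */
struct sketch_table {
  const void *keys[SKETCH_SLOTS];
};

/* The hash depends on the key's address, so it changes when the key
   moves in memory. */
static size_t
sketch_hash (const void *key)
{
  return (size_t) (((uintptr_t) key >> 3) % SKETCH_SLOTS);
}

/* Insert KEY with linear probing.  The table must have a free slot. */
static void
sketch_put (struct sketch_table *t, const void *key)
{
  size_t i = sketch_hash (key);
  size_t probes = 0;

  while (t->keys[i] != NULL)
    {
      probes++;
      assert (probes < SKETCH_SLOTS);
      i = (i + 1) % SKETCH_SLOTS;
    }
  t->keys[i] = key;
}

/* Return 1 if KEY can be found by probing from its hashed slot. */
static int
sketch_has (const struct sketch_table *t, const void *key)
{
  size_t i = sketch_hash (key);

  for (size_t probes = 0;
       probes < SKETCH_SLOTS && t->keys[i] != NULL;
       probes++)
    {
      if (t->keys[i] == key)
        return 1;
      i = (i + 1) % SKETCH_SLOTS;
    }
  return 0;
}

/* Reinsert every key at the slot its current address hashes to. */
static void
sketch_reorganize (struct sketch_table *t)
{
  struct sketch_table old = *t;

  memset (t, 0, sizeof *t);
  for (size_t i = 0; i < SKETCH_SLOTS; i++)
    if (old.keys[i] != NULL)
      sketch_put (t, old.keys[i]);
}
```

A key left in the slot chosen for its old address cannot be found by a lookup that hashes its new address, which is why each table has to be rebuilt after relocation.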
+
+\1f
+File: internals.info, Node: Remaining issues, Prev: Reloading phase, Up: Dumping
+
+Remaining issues
+================
+
+ The build process will have to start a post-dump xemacs, ask it for
+the loading address (which will, hopefully, always be the same between
+different xemacs invocations) and relocate the file to the new address.
+This way the object relocation phase will not have to be done, which
+means no writes to the objects and, because of the use of mmap, the
+dumped data will be shared between all the xemacs processes running on
+the computer.
+
+ Some executable signature will be necessary to ensure that a given
+dump file is really associated with a given executable, or random
+crashes will occur. One option is a random number set at compile or
+configure time through a define. This would also allow for having
+differently-compiled xemacsen on the same system (mule and no-mule
+come to mind).
+
+ The DOC file contents should probably end up in the dump file.
+
+\1f
+File: internals.info, Node: Events and the Event Loop, Next: Evaluation; Stack Frames; Bindings, Prev: Dumping, Up: Top
Events and the Event Loop
*************************
* Converting Events::
* Dispatching Events; The Command Builder::
-\1f
-File: internals.info, Node: Introduction to Events, Next: Main Loop, Up: Events and the Event Loop
-
-Introduction to Events
-======================
-
- An event is an object that encapsulates information about an
-interesting occurrence in the operating system. Events are generated
-either by user action, direct (e.g. typing on the keyboard or moving
-the mouse) or indirect (moving another window, thereby generating an
-expose event on an Emacs frame), or as a result of some other typically
-asynchronous action happening, such as output from a subprocess being
-ready or a timer expiring. Events come into the system in an
-asynchronous fashion (typically through a callback being called) and
-are converted into a synchronous event queue (first-in, first-out) in a
-process that we will call "collection".
-
- Note that each application has its own event queue. (It is
-immaterial whether the collection process directly puts the events in
-the proper application's queue, or puts them into a single system
-queue, which is later split up.)
-
- The most basic level of event collection is done by the operating
-system or window system. Typically, XEmacs does its own event
-collection as well. Often there are multiple layers of collection in
-XEmacs, with events from various sources being collected into a queue,
-which is then combined with other sources to go into another queue
-(i.e. a second level of collection), with perhaps another level on top
-of this, etc.
-
- XEmacs has its own types of events (called "Emacs events"), which
-provides an abstract layer on top of the system-dependent nature of the
-most basic events that are received. Part of the complex nature of the
-XEmacs event collection process involves converting from the
-operating-system events into the proper Emacs events--there may not be
-a one-to-one correspondence.
-
- Emacs events are documented in `events.h'; I'll discuss them later.
-