git.chise.org Git - chise/xemacs-chise.git.1/blob - info/internals.info-5

   1 This is Info file ../../info/internals.info, produced by Makeinfo
   2 version 1.68 from the input file internals.texi.
   3
   4 INFO-DIR-SECTION XEmacs Editor
   5 START-INFO-DIR-ENTRY
   6 * Internals: (internals).       XEmacs Internals Manual.
   7 END-INFO-DIR-ENTRY
   8
   9    Copyright (C) 1992 - 1996 Ben Wing.  Copyright (C) 1996, 1997 Sun
  10 Microsystems.  Copyright (C) 1994 - 1998 Free Software Foundation.
  11 Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
  12
  13    Permission is granted to make and distribute verbatim copies of this
  14 manual provided the copyright notice and this permission notice are
  15 preserved on all copies.
  16
  17    Permission is granted to copy and distribute modified versions of
  18 this manual under the conditions for verbatim copying, provided that the
  19 entire resulting derived work is distributed under the terms of a
  20 permission notice identical to this one.
  21
  22    Permission is granted to copy and distribute translations of this
  23 manual into another language, under the above conditions for modified
  24 versions, except that this permission notice may be stated in a
  25 translation approved by the Foundation.
  26
  27    Permission is granted to copy and distribute modified versions of
  28 this manual under the conditions for verbatim copying, provided also
  29 that the section entitled "GNU General Public License" is included
  30 exactly as in the original, and provided that the entire resulting
  31 derived work is distributed under the terms of a permission notice
  32 identical to this one.
  33
  34    Permission is granted to copy and distribute translations of this
  35 manual into another language, under the above conditions for modified
  36 versions, except that the section entitled "GNU General Public License"
  37 may be included in a translation approved by the Free Software
  38 Foundation instead of in the original English.
  39
  40 \1f
  41 File: internals.info,  Node: garbage_collect_1,  Next: mark_object,  Prev: Invocation,  Up: Garbage Collection - Step by Step
  42
  43 `garbage_collect_1'
  44 -------------------
  45
  46    We can now describe exactly what happens after the invocation takes
  47 place.
  48   1. There are several cases in which the garbage collector is left
  49      immediately: when we are already garbage collecting
  50      (`gc_in_progress'), when the garbage collection is somehow
  51      forbidden (`gc_currently_forbidden'), when we are currently
  52      displaying something (`in_display') or when we are preparing for
  53      the armageddon of the whole system (`preparing_for_armageddon').
  54
  55   2. Next the correct frame in which to put all the output occurring
  56      during garbage collecting is determined. In order to be able to
  57      restore the old display's state after displaying the message, some
  58      data about the current cursor position has to be saved. The
  59      variables `pre_gc_curser' and `cursor_changed' take care of that.
  60
  61   3. The state of `gc_currently_forbidden' must be restored after the
  62      garbage collection, no matter what happens during the process. We
  63      accomplish this by `record_unwind_protect'ing the suitable function
  64      `restore_gc_inhibit' together with the current value of
  65      `gc_currently_forbidden'.
  66
  67   4. If we are concurrently running an interactive xemacs session, the
  68      next step is simply to show the garbage collector's cursor/message.
  69
  70   5. The following steps are the intrinsic steps of the garbage
  71      collector, therefore `gc_in_progress' is set.
  72
  73   6. For debugging purposes, it is possible to copy the current C stack
  74      frame. However, this seems to be a currently unused feature.
  75
  76   7. Before actually starting to go over all live objects, references to
  77      objects that are no longer used are pruned. We only have to do
  78      this for events (`clear_event_resource') and for specifiers
  79      (`cleanup_specifiers').
  80
  81   8. Now the mark phase begins and marks all accessible elements. In
  82      order to start from all slots that serve as roots of
  83      accessibility, the function `mark_object' is called for each root
  84      individually to go out from there to mark all reachable objects.
  85      All roots that are traversed are shown in their processed order:
  86         * all constant symbols and static variables that are registered
  87           via `staticpro' in the array `staticvec'.  *Note Adding
  88           Global Lisp Variables::.
  89
  90         * all Lisp objects that are created in C functions and that
  91           must be protected from freeing them. They are registered in
  92           the global list `gcprolist'.  *Note GCPROing::.
  93
  94         * all local variables (i.e. their name fields `symbol' and old
  95           values `old_values') that are bound during the evaluation by
  96           the Lisp engine. They are stored in `specbinding' structs
  97           pushed on a stack called `specpdl'.  *Note Dynamic Binding;
  98           The specbinding Stack; Unwind-Protects::.
  99
 100         * all catch blocks that the Lisp engine encounters during the
 101           evaluation cause the creation of structs `catchtag' inserted
 102           in the list `catchlist'. Their tag (`tag') and value (`val'
 103           fields are freshly created objects and therefore have to be
 104           marked.  *Note Catch and Throw::.
 105
 106         * every function application pushes new structs `backtrace' on
 107           the call stack of the Lisp engine (`backtrace_list'). The
 108           unique parts that have to be marked are the fields for each
 109           function (`function') and all their arguments (`args').
 110           *Note Evaluation::.
 111
 112         * all objects that are used by the redisplay engine that must
 113           not be freed are marked by a special function called
 114           `mark_redisplay' (in `redisplay.c').
 115
 116         * all objects created for profiling purposes are allocated by C
 117           functions instead of using the lisp allocation mechanisms. In
 118           order to receive the right ones during the sweep phase, they
 119           also have to be marked manually. That is done by the function
 120           `mark_profiling_info'
 121
 122   9. Hash tables in Xemacs belong to a kind of special objects that
 123      make use of a concept often called 'weak pointers'.  To make a
 124      long story short, these kind of pointers are not followed during
 125      the estimation of the live objects during garbage collection.  Any
 126      object referenced only by weak pointers is collected anyway, and
 127      the reference to it is cleared. In hash tables there are different
 128      usage patterns of them, manifesting in different types of hash
 129      tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak'
 130      (internally also 'key-car-weak' and 'value-car-weak') hash tables,
 131      each clearing entries depending on different conditions. More
 132      information can be found in the documentation to the function
 133      `make-hash-table'.
 134
 135      Because there are complicated dependency rules about when and what
 136      to mark while processing weak hash tables, the standard `marker'
 137      method is only active if it is marking non-weak hash tables. As
 138      soon as a weak component is in the table, the hash table entries
 139      are ignored while marking. Instead their marking is done each
 140      separately by the function `finish_marking_weak_hash_tables'. This
 141      function iterates over each hash table entry `hentries' for each
 142      weak hash table in `Vall_weak_hash_tables'. Depending on the type
 143      of a table, the appropriate action is performed.  If a table is
 144      acting as `HASH_TABLE_KEY_WEAK', and a key already marked,
 145      everything reachable from the `value' component is marked. If it is
 146      acting as a `HASH_TABLE_VALUE_WEAK' and the value component is
 147      already marked, the marking starts beginning only from the `key'
 148      component.  If it is a `HASH_TABLE_KEY_CAR_WEAK' and the car of
 149      the key entry is already marked, we mark both the `key' and
 150      `value' components.  Finally, if the table is of the type
 151      `HASH_TABLE_VALUE_CAR_WEAK' and the car of the value components is
 152      already marked, again both the `key' and the `value' components
 153      get marked.
 154
 155      Again, there are lists with comparable properties called weak
 156      lists. There exist different peculiarities of their types called
 157      `simple', `assoc', `key-assoc' and `value-assoc'. You can find
 158      further details about them in the description to the function
 159      `make-weak-list'. The scheme of their marking is similar: all weak
 160      lists are listed in `Qall_weak_lists', therefore we iterate over
 161      them. The marking is advanced until we hit an already marked pair.
 162      Then we know that during a former run all the rest has been marked
 163      completely. Again, depending on the special type of the weak list,
 164      our jobs differ. If it is a `WEAK_LIST_SIMPLE' and the elem is
 165      marked, we mark the `cons' part. If it is a `WEAK_LIST_ASSOC' and
 166      not a pair or a pair with both marked car and cdr, we mark the
 167      `cons' and the `elem'. If it is a `WEAK_LIST_KEY_ASSOC' and not a
 168      pair or a pair with a marked car of the elem, we mark the `cons'
 169      and the `elem'. Finally, if it is a `WEAK_LIST_VALUE_ASSOC' and
 170      not a pair or a pair with a marked cdr of the elem, we mark both
 171      the `cons' and the `elem'.
 172
 173      Since, by marking objects in reach from weak hash tables and weak
 174      lists, other objects could get marked, this perhaps implies
 175      further marking of other weak objects, both finishing functions
 176      are redone as long as yet unmarked objects get freshly marked.
 177
 178  10. After completing the special marking for the weak hash tables and
 179      for the weak lists, all entries that point to objects that are
 180      going to be swept in the further process are useless, and
 181      therefore have to be removed from the table or the list.
 182
 183      The function `prune_weak_hash_tables' does the job for weak hash
 184      tables. Totally unmarked hash tables are removed from the list
 185      `Vall_weak_hash_tables'. The other ones are treated more carefully
 186      by scanning over all entries and removing one as soon as one of
 187      the components `key' and `value' is unmarked.
 188
 189      The same idea applies to the weak lists. It is accomplished by
 190      `prune_weak_lists': An unmarked list is pruned from
 191      `Vall_weak_lists' immediately. A marked list is treated more
 192      carefully by going over it and removing just the unmarked pairs.
 193
 194  11. The function `prune_specifiers' checks all listed specifiers held
 195      in `Vall_speficiers' and removes the ones from the lists that are
 196      unmarked.
 197
 198  12. All syntax tables are stored in a list called
 199      `Vall_syntax_tables'. The function `prune_syntax_tables' walks
 200      through it and unlinks the tables that are unmarked.
 201
 202  13. Next, we will attack the complete sweeping - the function
 203      `gc_sweep' which holds the predominance.
 204
 205  14. First, all the variables with respect to garbage collection are
 206      reset. `consing_since_gc' - the counter of the created cells since
 207      the last garbage collection - is set back to 0, and
 208      `gc_in_progress' is not `true' anymore.
 209
 210  15. In case the session is interactive, the displayed cursor and
 211      message are removed again.
 212
 213  16. The state of `gc_inhibit' is restored to the former value by
 214      unwinding the stack.
 215
 216  17. A small memory reserve is always held back that can be reached by
 217      `breathing_space'. If nothing more is left, we create a new reserve
 218      and exit.
 219
 220 \1f
 221 File: internals.info,  Node: mark_object,  Next: gc_sweep,  Prev: garbage_collect_1,  Up: Garbage Collection - Step by Step
 222
 223 `mark_object'
 224 -------------
 225
 226    The first thing that is checked while marking an object is whether
 227 the object is a real Lisp object `Lisp_Type_Record' or just an integer
 228 or a character. Integers and characters are the only two types that are
 229 stored directly - without another level of indirection, and therefore
 230 they don´t have to be marked and collected.  *Note How Lisp Objects Are
 231 Represented in C::.
 232
 233    The second case is the one we have to handle. It is the one when we
 234 are dealing with a pointer to a Lisp object. But, there exist also three
 235 possibilities, that prevent us from doing anything while marking: The
 236 object is read only which prevents it from being garbage collected,
 237 i.e. marked (`C_READONLY_RECORD_HEADER'). The object in question is
 238 already marked, and need not be marked for the second time (checked by
 239 `MARKED_RECORD_HEADER_P'). If it is a special, unmarkable object
 240 (`UNMARKABLE_RECORD_HEADER_P', apparently, these are objects that sit
 241 in some CONST space, and can therefore not be marked, see
 242 `this_one_is_unmarkable' in `alloc.c').
 243
 244    Now, the actual marking is feasible. We do so by once using the macro
 245 `MARK_RECORD_HEADER' to mark the object itself (actually the special
 246 flag in the lrecord header), and calling its special marker "method"
 247 `marker' if available. The marker method marks every other object that
 248 is in reach from our current object. Note, that these marker methods
 249 should not call `mark_object' recursively, but instead should return
 250 the next object from where further marking has to be performed.
 251
 252    In case another object was returned, as mentioned before, we
 253 reiterate the whole `mark_object' process beginning with this next
 254 object.
 255
 256 \1f
 257 File: internals.info,  Node: gc_sweep,  Next: sweep_lcrecords_1,  Prev: mark_object,  Up: Garbage Collection - Step by Step
 258
 259 `gc_sweep'
 260 ----------
 261
 262    The job of this function is to free all unmarked records from
 263 memory. As we know, there are different types of objects implemented
 264 and managed, and consequently different ways to free them from memory.
 265 *Note Introduction to Allocation::.
 266
 267    We start with all objects stored through `lcrecords'. All bulkier
 268 objects are allocated and handled using that scheme of `lcrecords'.
 269 Each object is `malloc'ed separately instead of placing it in one of
 270 the contiguous frob blocks. All types that are currently stored using
 271 `lcrecords'´s  `alloc_lcrecord' and `make_lcrecord_list' are the types:
 272 vectors, buffers, char-table, char-table-entry, console, weak-list,
 273 database, device, ldap, hash-table, command-builder, extent-auxiliary,
 274 extent-info, face, coding-system, frame, image-instance, glyph,
 275 popup-data, gui-item, keymap, charset, color_instance, font_instance,
 276 opaque, opaque-list, process, range-table, specifier,
 277 symbol-value-buffer-local, symbol-value-lisp-magic,
 278 symbol-value-varalias, toolbar-button, tooltalk-message,
 279 tooltalk-pattern, window, and window-configuration. We take care of
 280 them in the fist place in order to be able to handle and to finalize
 281 items stored in them more easily. The function `sweep_lcrecords_1' as
 282 described below is doing the whole job for us.  For a description about
 283 the internals: *Note lrecords::.
 284
 285    Our next candidates are the other objects that behave quite
 286 differently than everything else: the strings. They consists of two
 287 parts, a fixed-size portion (`struct Lisp_string') holding the string's
 288 length, its property list and a pointer to the second part, and the
 289 actual string data, which is stored in string-chars blocks comparable to
 290 frob blocks. In this block, the data is not only freed, but also a
 291 compression of holes is made, i.e. all strings are relocated together.
 292 *Note String::. This compacting phase is performed by the function
 293 `compact_string_chars', the actual sweeping by the function
 294 `sweep_strings' is described below.
 295
 296    After that, the other types are swept step by step using functions
 297 `sweep_conses', `sweep_bit_vectors_1', `sweep_compiled_functions',
 298 `sweep_floats', `sweep_symbols', `sweep_extents', `sweep_markers' and
 299 `sweep_extents'.  They are the fixed-size types cons, floats,
 300 compiled-functions, symbol, marker, extent, and event stored in
 301 so-called "frob blocks", and therefore we can basically do the same on
 302 every type objects, using the same macros, especially defined only to
 303 handle everything with respect to fixed-size blocks. The only fixed-size
 304 type that is not handled here are the fixed-size portion of strings,
 305 because we took special care of them earlier.
 306
 307    The only big exceptions are bit vectors stored differently and
 308 therefore treated differently by the function `sweep_bit_vectors_1'
 309 described later.
 310
 311    At first, we need some brief information about how these fixed-size
 312 types are managed in general, in order to understand how the sweeping
 313 is done. They have all a fixed size, and are therefore stored in big
 314 blocks of memory - allocated at once - that can hold a certain amount
 315 of objects of one type. The macro `DECLARE_FIXED_TYPE_ALLOC' creates
 316 the suitable structures for every type. More precisely, we have the
 317 block struct (holding a pointer to the previous block `prev' and the
 318 objects in `block[]'), a pointer to current block
 319 (`current_..._block)') and its last index (`current_..._block_index'),
 320 and a pointer to the free list that will be created. Also a macro
 321 `FIXED_TYPE_FROM_BLOCK' plus some related macros exists that are used
 322 to obtain a new object, either from the free list
 323 `ALLOCATE_FIXED_TYPE_1' if there is an unused object of that type
 324 stored or by allocating a completely new block using
 325 `ALLOCATE_FIXED_TYPE_FROM_BLOCK'.
 326
 327    The rest works as follows: all of them define a macro `UNMARK_...'
 328 that is used to unmark the object. They define a macro
 329 `ADDITIONAL_FREE_...' that defines additional work that has to be done
 330 when converting an object from in use to not in use (so far, only
 331 markers use it in order to unchain them). Then, they all call the macro
 332 `SWEEP_FIXED_TYPE_BLOCK' instantiated with their type name and their
 333 struct name.
 334
 335    This call in particular does the following: we go over all blocks
 336 starting with the current moving towards the oldest.  For each block,
 337 we look at every object in it. If the object already freed (checked
 338 with `FREE_STRUCT_P' using the first pointer of the object), or if it is
 339 set to read only (`C_READONLY_RECORD_HEADER_P', nothing must be done.
 340 If it is unmarked (checked with `MARKED_RECORD_HEADER_P'), it is put in
 341 the free list and set free (using the macro `FREE_FIXED_TYPE',
 342 otherwise it stays in the block, but is unmarked (by `UNMARK_...').
 343 While going through one block, we note if the whole block is empty. If
 344 so, the whole block is freed (using `xfree') and the free list state is
 345 set to the state it had before handling this block.
 346
 347 \1f
 348 File: internals.info,  Node: sweep_lcrecords_1,  Next: compact_string_chars,  Prev: gc_sweep,  Up: Garbage Collection - Step by Step
 349
 350 `sweep_lcrecords_1'
 351 -------------------
 352
 353    After nullifying the complete lcrecord statistics, we go over all
 354 lcrecords two separate times. They are all chained together in a list
 355 with a head called `all_lcrecords'.
 356
 357    The first loop calls for each object its `finalizer' method, but only
 358 in the case that it is not read only (`C_READONLY_RECORD_HEADER_P)', it
 359 is not already marked (`MARKED_RECORD_HEADER_P'), it is not already in
 360 a free list (list of freed objects, field `free') and finally it owns a
 361 finalizer method.
 362
 363    The second loop actually frees the appropriate objects again by
 364 iterating through the whole list. In case an object is read only or
 365 marked, it has to persist, otherwise it is manually freed by calling
 366 `xfree'. During this loop, the lcrecord statistics are kept up to date
 367 by calling `tick_lcrecord_stats' with the right arguments,
 368
 369 \1f
 370 File: internals.info,  Node: compact_string_chars,  Next: sweep_strings,  Prev: sweep_lcrecords_1,  Up: Garbage Collection - Step by Step
 371
 372 `compact_string_chars'
 373 ----------------------
 374
 375    The purpose of this function is to compact all the data parts of the
 376 strings that are held in so-called `string_chars_block', i.e. the
 377 strings that do not exceed a certain maximal length.
 378
 379    The procedure with which this is done is as follows. We are keeping
 380 two positions in the `string_chars_block's using two pointer/integer
 381 pairs, namely `from_sb'/`from_pos' and `to_sb'/`to_pos'. They stand for
 382 the actual positions, from where to where, to copy the actually handled
 383 string.
 384
 385    While going over all chained `string_char_block's and their held
 386 strings, staring at `first_string_chars_block', both pointers are
 387 advanced and eventually a string is copied from `from_sb' to `to_sb',
 388 depending on the status of the pointed at strings.
 389
 390    More precisely, we can distinguish between the following actions.
 391    * The string at `from_sb''s position could be marked as free, which
 392      is indicated by an invalid pointer to the pointer that should
 393      point back to the fixed size string object, and which is checked by
 394      `FREE_STRUCT_P'. In this case, the `from_sb'/`from_pos' is
 395      advanced to the next string, and nothing has to be copied.
 396
 397    * Also, if a string object itself is unmarked, nothing has to be
 398      copied. We likewise advance the `from_sb'/`from_pos' pair as
 399      described above.
 400
 401    * In all other cases, we have a marked string at hand. The string
 402      data must be moved from the from-position to the to-position. In
 403      case there is not enough space in the actual `to_sb'-block, we
 404      advance this pointer to the beginning of the next block before
 405      copying. In case the from and to positions are different, we
 406      perform the actual copying using the library function `memmove'.
 407
 408    After compacting, the pointer to the current `string_chars_block',
 409 sitting in `current_string_chars_block', is reset on the last block to
 410 which we moved a string, i.e. `to_block', and all remaining blocks (we
 411 know that they just carry garbage) are explicitly `xfree'd.
 412
 413 \1f
 414 File: internals.info,  Node: sweep_strings,  Next: sweep_bit_vectors_1,  Prev: compact_string_chars,  Up: Garbage Collection - Step by Step
 415
 416 `sweep_strings'
 417 ---------------
 418
 419    The sweeping for the fixed sized string objects is essentially
 420 exactly the same as it is for all other fixed size types. As before,
 421 the freeing into the suitable free list is done by using the macro
 422 `SWEEP_FIXED_SIZE_BLOCK' after defining the right macros
 423 `UNMARK_string' and `ADDITIONAL_FREE_string'. These two definitions are
 424 a little bit special compared to the ones used for the other fixed size
 425 types.
 426
 427    `UNMARK_string' is defined the same way except some additional code
 428 used for updating the bookkeeping information.
 429
 430    For strings, `ADDITIONAL_FREE_string' has to do something in
 431 addition: in case, the string was not allocated in a
 432 `string_chars_block' because it exceeded the maximal length, and
 433 therefore it was `malloc'ed separately, we know also `xfree' it
 434 explicitly.
 435
 436 \1f
 437 File: internals.info,  Node: sweep_bit_vectors_1,  Prev: sweep_strings,  Up: Garbage Collection - Step by Step
 438
 439 `sweep_bit_vectors_1'
 440 ---------------------
 441
 442    Bit vectors are also one of the rare types that are `malloc'ed
 443 individually. Consequently, while sweeping, all further needless bit
 444 vectors must be freed by hand. This is done, as one might imagine, the
 445 expected way: since they are all registered in a list called
 446 `all_bit_vectors', all elements of that list are traversed, all
 447 unmarked bit vectors are unlinked by calling `xfree' and all of them
 448 become unmarked.  In addition, the bookkeeping information used for
 449 garbage collector's output purposes is updated.
 450
 451 \1f
 452 File: internals.info,  Node: Integers and Characters,  Next: Allocation from Frob Blocks,  Prev: Garbage Collection - Step by Step,  Up: Allocation of Objects in XEmacs Lisp
 453
 454 Integers and Characters
 455 =======================
 456
 457    Integer and character Lisp objects are created from integers using
 458 the macros `XSETINT()' and `XSETCHAR()' or the equivalent functions
 459 `make_int()' and `make_char()'. (These are actually macros on most
 460 systems.)  These functions basically just do some moving of bits
 461 around, since the integral value of the object is stored directly in
 462 the `Lisp_Object'.
 463
 464    `XSETINT()' and the like will truncate values given to them that are
 465 too big; i.e. you won't get the value you expected but the tag bits
 466 will at least be correct.
 467
 468 \1f
 469 File: internals.info,  Node: Allocation from Frob Blocks,  Next: lrecords,  Prev: Integers and Characters,  Up: Allocation of Objects in XEmacs Lisp
 470
 471 Allocation from Frob Blocks
 472 ===========================
 473
 474    The uninitialized memory required by a `Lisp_Object' of a particular
 475 type is allocated using `ALLOCATE_FIXED_TYPE()'.  This only occurs
 476 inside of the lowest-level object-creating functions in `alloc.c':
 477 `Fcons()', `make_float()', `Fmake_byte_code()', `Fmake_symbol()',
 478 `allocate_extent()', `allocate_event()', `Fmake_marker()', and
 479 `make_uninit_string()'.  The idea is that, for each type, there are a
 480 number of frob blocks (each 2K in size); each frob block is divided up
 481 into object-sized chunks.  Each frob block will have some of these
 482 chunks that are currently assigned to objects, and perhaps some that are
 483 free. (If a frob block has nothing but free chunks, it is freed at the
 484 end of the garbage collection cycle.)  The free chunks are stored in a
 485 free list, which is chained by storing a pointer in the first four bytes
 486 of the chunk. (Except for the free chunks at the end of the last frob
 487 block, which are handled using an index which points past the end of the
 488 last-allocated chunk in the last frob block.)  `ALLOCATE_FIXED_TYPE()'
 489 first tries to retrieve a chunk from the free list; if that fails, it
 490 calls `ALLOCATE_FIXED_TYPE_FROM_BLOCK()', which looks at the end of the
 491 last frob block for space, and creates a new frob block if there is
 492 none. (There are actually two versions of these macros, one of which is
 493 more defensive but less efficient and is used for error-checking.)
 494
 495 \1f
 496 File: internals.info,  Node: lrecords,  Next: Low-level allocation,  Prev: Allocation from Frob Blocks,  Up: Allocation of Objects in XEmacs Lisp
 497
 498 lrecords
 499 ========
 500
 501    [see `lrecord.h']
 502
 503    All lrecords have at the beginning of their structure a `struct
 504 lrecord_header'.  This just contains a pointer to a `struct
 505 lrecord_implementation', which is a structure containing method pointers
 506 and such.  There is one of these for each type, and it is a global,
 507 constant, statically-declared structure that is declared in the
 508 `DEFINE_LRECORD_IMPLEMENTATION()' macro. (This macro actually declares
 509 an array of two `struct lrecord_implementation' structures.  The first
 510 one contains all the standard method pointers, and is used in all
 511 normal circumstances.  During garbage collection, however, the lrecord
 512 is "marked" by bumping its implementation pointer by one, so that it
 513 points to the second structure in the array.  This structure contains a
 514 special indication in it that it's a "marked-object" structure: the
 515 finalize method is the special function `this_marks_a_marked_record()',
 516 and all other methods are null pointers.  At the end of garbage
 517 collection, all lrecords will either be reclaimed or unmarked by
 518 decrementing their implementation pointers, so this second structure
 519 pointer will never remain past garbage collection.
 520
 521    Simple lrecords (of type (c) above) just have a `struct
 522 lrecord_header' at their beginning.  lcrecords, however, actually have a
 523 `struct lcrecord_header'.  This, in turn, has a `struct lrecord_header'
 524 at its beginning, so sanity is preserved; but it also has a pointer
 525 used to chain all lcrecords together, and a special ID field used to
 526 distinguish one lcrecord from another. (This field is used only for
 527 debugging and could be removed, but the space gain is not significant.)
 528
 529    Simple lrecords are created using `ALLOCATE_FIXED_TYPE()', just like
 530 for other frob blocks.  The only change is that the implementation
 531 pointer must be initialized correctly. (The implementation structure for
 532 an lrecord, or rather the pointer to it, is named `lrecord_float',
 533 `lrecord_extent', `lrecord_buffer', etc.)
 534
 535    lcrecords are created using `alloc_lcrecord()'.  This takes a size
 536 to allocate and an implementation pointer. (The size needs to be passed
 537 because some lcrecords, such as window configurations, are of variable
 538 size.) This basically just `malloc()'s the storage, initializes the
 539 `struct lcrecord_header', and chains the lcrecord onto the head of the
 540 list of all lcrecords, which is stored in the variable `all_lcrecords'.
 541 The calls to `alloc_lcrecord()' generally occur in the lowest-level
 542 allocation function for each lrecord type.
 543
 544    Whenever you create an lrecord, you need to call either
 545 `DEFINE_LRECORD_IMPLEMENTATION()' or
 546 `DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()'.  This needs to be specified
 547 in a C file, at the top level.  What this actually does is define and
 548 initialize the implementation structure for the lrecord. (And possibly
 549 declares a function `error_check_foo()' that implements the `XFOO()'
 550 macro when error-checking is enabled.)  The arguments to the macros are
 551 the actual type name (this is used to construct the C variable name of
 552 the lrecord implementation structure and related structures using the
 553 `##' macro concatenation operator), a string that names the type on the
 554 Lisp level (this may not be the same as the C type name; typically, the
 555 C type name has underscores, while the Lisp string has dashes), various
 556 method pointers, and the name of the C structure that contains the
 557 object.  The methods are used to encapsulate type-specific information
 558 about the object, such as how to print it or mark it for garbage
 559 collection, so that it's easy to add new object types without having to
 560 add a specific case for each new type in a bunch of different places.
 561
 562    The difference between `DEFINE_LRECORD_IMPLEMENTATION()' and
 563 `DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()' is that the former is used
 564 for fixed-size object types and the latter is for variable-size object
 565 types.  Most object types are fixed-size; some complex types, however
 566 (e.g. window configurations), are variable-size.  Variable-size object
 567 types have an extra method, which is called to determine the actual
 568 size of a particular object of that type.  (Currently this is only used
 569 for keeping allocation statistics.)
 570
 571    For the purpose of keeping allocation statistics, the allocation
 572 engine keeps a list of all the different types that exist.  Note that,
 573 since `DEFINE_LRECORD_IMPLEMENTATION()' is a macro that is specified at
 574 top-level, there is no way for it to add to the list of all existing
 575 types.  What happens instead is that each implementation structure
 576 contains in it a dynamically assigned number that is particular to that
 577 type. (Or rather, it contains a pointer to another structure that
 578 contains this number.  This evasiveness is done so that the
 579 implementation structure can be declared const.) In the sweep stage of
 580 garbage collection, each lrecord is examined to see if its
 581 implementation structure has its dynamically-assigned number set.  If
 582 not, it must be a new type, and it is added to the list of known types
 583 and a new number assigned.  The number is used to index into an array
 584 holding the number of objects of each type and the total memory
 585 allocated for objects of that type.  The statistics in this array are
 586 also computed during the sweep stage.  These statistics are returned by
 587 the call to `garbage-collect' and are printed out at the end of the
 588 loadup phase.
 589
 590    Note that for every type defined with a `DEFINE_LRECORD_*()' macro,
 591 there needs to be a `DECLARE_LRECORD_IMPLEMENTATION()' somewhere in a
 592 `.h' file, and this `.h' file needs to be included by `inline.c'.
 593
 594    Furthermore, there should generally be a set of `XFOOBAR()',
 595 `FOOBARP()', etc. macros in a `.h' (or occasionally `.c') file.  To
 596 create one of these, copy an existing model and modify as necessary.
 597
 598    The various methods in the lrecord implementation structure are:
 599
 600   1. A "mark" method.  This is called during the marking stage and
 601      passed a function pointer (usually the `mark_object()' function),
 602      which is used to mark an object.  All Lisp objects that are
 603      contained within the object need to be marked by applying this
 604      function to them.  The mark method should also return a Lisp
 605      object, which should be either nil or an object to mark. (This can
 606      be used in lieu of calling `mark_object()' on the object, to
 607      reduce the recursion depth, and consequently should be the most
 608      heavily nested sub-object, such as a long list.)
 609
 610      *Please note:* When the mark method is called, garbage collection
 611      is in progress, and special precautions need to be taken when
 612      accessing objects; see section (B) above.
 613
 614      If your mark method does not need to do anything, it can be `NULL'.
 615
 616   2. A "print" method.  This is called to create a printed
 617      representation of the object, whenever `princ', `prin1', or the
 618      like is called.  It is passed the object, a stream to which the
 619      output is to be directed, and an `escapeflag' which indicates
 620      whether the object's printed representation should be "escaped" so
 621      that it is readable. (This corresponds to the difference between
 622      `princ' and `prin1'.) Basically, "escaped" means that strings will
 623      have quotes around them and confusing characters in the strings
 624      such as quotes, backslashes, and newlines will be backslashed; and
 625      that special care will be taken to make symbols print in a
 626      readable fashion (e.g. symbols that look like numbers will be
 627      backslashed).  Other readable objects should perhaps pass
 628      `escapeflag' on when sub-objects are printed, so that readability
 629      is preserved when necessary (or if not, always pass in a 1 for
 630      `escapeflag').  Non-readable objects should in general ignore
 631      `escapeflag', except that some use it as an indication that more
 632      verbose output should be given.
 633
 634      Sub-objects are printed using `print_internal()', which takes
 635      exactly the same arguments as are passed to the print method.
 636
 637      Literal C strings should be printed using `write_c_string()', or
 638      `write_string_1()' for non-null-terminated strings.
 639
 640      Functions that do not have a readable representation should check
 641      the `print_readably' flag and signal an error if it is set.
 642
 643      If you specify NULL for the print method, the
 644      `default_object_printer()' will be used.
 645
 646   3. A "finalize" method.  This is called at the beginning of the sweep
 647      stage on lcrecords that are about to be freed, and should be used
 648      to perform any extra object cleanup.  This typically involves
 649      freeing any extra `malloc()'ed memory associated with the object,
 650      releasing any operating-system and window-system resources
 651      associated with the object (e.g. pixmaps, fonts), etc.
 652
 653      The finalize method can be NULL if nothing needs to be done.
 654
 655      WARNING #1: The finalize method is also called at the end of the
 656      dump phase; this time with the for_disksave parameter set to
 657      non-zero.  The object is *not* about to disappear, so you have to
 658      make sure to *not* free any extra `malloc()'ed memory if you're
 659      going to need it later.  (Also, signal an error if there are any
 660      operating-system and window-system resources here, because they
 661      can't be dumped.)
 662
 663      Finalize methods should, as a rule, set to zero any pointers after
 664      they've been freed, and check to make sure pointers are not zero
 665      before freeing.  Although I'm pretty sure that finalize methods
 666      are not called twice on the same object (except for the
 667      `for_disksave' proviso), we've gotten nastily burned in some cases
 668      by not doing this.
 669
 670      WARNING #2: The finalize method is *only* called for lcrecords,
 671      *not* for simply lrecords.  If you need a finalize method for
 672      simple lrecords, you have to stick it in the
 673      `ADDITIONAL_FREE_foo()' macro in `alloc.c'.
 674
 675      WARNING #3: Things are in an *extremely* bizarre state when
 676      `ADDITIONAL_FREE_foo()' is called, so you have to be incredibly
 677      careful when writing one of these functions.  See the comment in
 678      `gc_sweep()'.  If you ever have to add one of these, consider
 679      using an lcrecord or dealing with the problem in a different
 680      fashion.
 681
 682   4. An "equal" method.  This compares the two objects for similarity,
 683      when `equal' is called.  It should compare the contents of the
 684      objects in some reasonable fashion.  It is passed the two objects
 685      and a "depth" value, which is used to catch circular objects.  To
 686      compare sub-Lisp-objects, call `internal_equal()' and bump the
 687      depth value by one.  If this value gets too high, a
 688      `circular-object' error will be signaled.
 689
 690      If this is NULL, objects are `equal' only when they are `eq', i.e.
 691      identical.
 692
 693   5. A "hash" method.  This is used to hash objects when they are to be
 694      compared with `equal'.  The rule here is that if two objects are
 695      `equal', they *must* hash to the same value; i.e. your hash
 696      function should use some subset of the sub-fields of the object
 697      that are compared in the "equal" method.  If you specify this
 698      method as `NULL', the object's pointer will be used as the hash,
 699      which will *fail* if the object has an `equal' method, so don't do
 700      this.
 701
 702      To hash a sub-Lisp-object, call `internal_hash()'.  Bump the depth
 703      by one, just like in the "equal" method.
 704
 705      To convert a Lisp object directly into a hash value (using its
 706      pointer), use `LISP_HASH()'.  This is what happens when the hash
 707      method is NULL.
 708
 709      To hash two or more values together into a single value, use
 710      `HASH2()', `HASH3()', `HASH4()', etc.
 711
 712   6. "getprop", "putprop", "remprop", and "plist" methods.  These are
 713      used for object types that have properties.  I don't feel like
 714      documenting them here.  If you create one of these objects, you
 715      have to use different macros to define them, i.e.
 716      `DEFINE_LRECORD_IMPLEMENTATION_WITH_PROPS()' or
 717      `DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION_WITH_PROPS()'.
 718
 719   7. A "size_in_bytes" method, when the object is of variable-size.
 720      (i.e. declared with a `_SEQUENCE_IMPLEMENTATION' macro.)  This
 721      should simply return the object's size in bytes, exactly as you
 722      might expect.  For an example, see the methods for window
 723      configurations and opaques.
 724
 725 \1f
 726 File: internals.info,  Node: Low-level allocation,  Next: Pure Space,  Prev: lrecords,  Up: Allocation of Objects in XEmacs Lisp
 727
 728 Low-level allocation
 729 ====================
 730
 731    Memory that you want to allocate directly should be allocated using
 732 `xmalloc()' rather than `malloc()'.  This implements error-checking on
 733 the return value, and once upon a time did some more vital stuff (i.e.
 734 `BLOCK_INPUT', which is no longer necessary).  Free using `xfree()',
 735 and realloc using `xrealloc()'.  Note that `xmalloc()' will do a
 736 non-local exit if the memory can't be allocated. (Many functions,
 737 however, do not expect this, and thus XEmacs will likely crash if this
 738 happens.  *This is a bug.*  If you can, you should strive to make your
 739 function handle this OK.  However, it's difficult in the general
 740 circumstance, perhaps requiring extra unwind-protects and such.)
 741
 742    Note that XEmacs provides two separate replacements for the standard
 743 `malloc()' library function.  These are called "old GNU malloc"
 744 (`malloc.c') and "new GNU malloc" (`gmalloc.c'), respectively.  New GNU
 745 malloc is better in pretty much every way than old GNU malloc, and
 746 should be used if possible.  (It used to be that on some systems, the
 747 old one worked but the new one didn't.  I think this was due
 748 specifically to a bug in SunOS, which the new one now works around; so
 749 I don't think the old one ever has to be used any more.) The primary
 750 difference between both of these mallocs and the standard system malloc
 751 is that they are much faster, at the expense of increased space.  The
 752 basic idea is that memory is allocated in fixed chunks of powers of
 753 two.  This allows for basically constant malloc time, since the various
 754 chunks can just be kept on a number of free lists. (The standard system
 755 malloc typically allocates arbitrary-sized chunks and has to spend some
 756 time, sometimes a significant amount of time, walking the heap looking
 757 for a free block to use and cleaning things up.)  The new GNU malloc
 758 improves on things by allocating large objects in chunks of 4096 bytes
 759 rather than in ever larger powers of two, which results in ever larger
 760 wastage.  There is a slight speed loss here, but it's of doubtful
 761 significance.
 762
 763    NOTE: Apparently there is a third-generation GNU malloc that is
 764 significantly better than the new GNU malloc, and should probably be
 765 included in XEmacs.
 766
 767    There is also the relocating allocator, `ralloc.c'.  This actually
 768 moves blocks of memory around so that the `sbrk()' pointer shrunk and
 769 virtual memory released back to the system.  On some systems, this is a
 770 big win.  On all systems, it causes a noticeable (and sometimes huge)
 771 speed penalty, so I turn it off by default.  `ralloc.c' only works with
 772 the new GNU malloc in `gmalloc.c'.  There are also two versions of
 773 `ralloc.c', one that uses `mmap()' rather than block copies to move
 774 data around.  This purports to be faster, although that depends on the
 775 amount of data that would have had to be block copied and the
 776 system-call overhead for `mmap()'.  I don't know exactly how this
 777 works, except that the relocating-allocation routines are pretty much
 778 used only for the memory allocated for a buffer, which is the biggest
 779 consumer of space, esp. of space that may get freed later.
 780
 781    Note that the GNU mallocs have some "memory warning" facilities.
 782 XEmacs taps into them and issues a warning through the standard warning
 783 system, when memory gets to 75%, 85%, and 95% full.  (On some systems,
 784 the memory warnings are not functional.)
 785
 786    Allocated memory that is going to be used to make a Lisp object is
 787 created using `allocate_lisp_storage()'.  This calls `xmalloc()' but
 788 also verifies that the pointer to the memory can fit into a Lisp word
 789 (remember that some bits are taken away for a type tag and a mark bit).
 790 If not, an error is issued through `memory_full()'.
 791 `allocate_lisp_storage()' is called by `alloc_lcrecord()',
 792 `ALLOCATE_FIXED_TYPE()', and the vector and bit-vector creation
 793 routines.  These routines also call `INCREMENT_CONS_COUNTER()' at the
 794 appropriate times; this keeps statistics on how much memory is
 795 allocated, so that garbage-collection can be invoked when the threshold
 796 is reached.
 797
 798 \1f
 799 File: internals.info,  Node: Pure Space,  Next: Cons,  Prev: Low-level allocation,  Up: Allocation of Objects in XEmacs Lisp
 800
 801 Pure Space
 802 ==========
 803
 804    Not yet documented.
 805
 806 \1f
 807 File: internals.info,  Node: Cons,  Next: Vector,  Prev: Pure Space,  Up: Allocation of Objects in XEmacs Lisp
 808
 809 Cons
 810 ====
 811
 812    Conses are allocated in standard frob blocks.  The only thing to
 813 note is that conses can be explicitly freed using `free_cons()' and
 814 associated functions `free_list()' and `free_alist()'.  This
 815 immediately puts the conses onto the cons free list, and decrements the
 816 statistics on memory allocation appropriately.  This is used to good
 817 effect by some extremely commonly-used code, to avoid generating extra
 818 objects and thereby triggering GC sooner.  However, you have to be
 819 *extremely* careful when doing this.  If you mess this up, you will get
 820 BADLY BURNED, and it has happened before.
 821
 822 \1f
 823 File: internals.info,  Node: Vector,  Next: Bit Vector,  Prev: Cons,  Up: Allocation of Objects in XEmacs Lisp
 824
 825 Vector
 826 ======
 827
 828    As mentioned above, each vector is `malloc()'ed individually, and
 829 all are threaded through the variable `all_vectors'.  Vectors are
 830 marked strangely during garbage collection, by kludging the size field.
 831 Note that the `struct Lisp_Vector' is declared with its `contents'
 832 field being a *stretchy* array of one element.  It is actually
 833 `malloc()'ed with the right size, however, and access to any element
 834 through the `contents' array works fine.
 835
 836 \1f
 837 File: internals.info,  Node: Bit Vector,  Next: Symbol,  Prev: Vector,  Up: Allocation of Objects in XEmacs Lisp
 838
 839 Bit Vector
 840 ==========
 841
 842    Bit vectors work exactly like vectors, except for more complicated
 843 code to access an individual bit, and except for the fact that bit
 844 vectors are lrecords while vectors are not. (The only difference here is
 845 that there's an lrecord implementation pointer at the beginning and the
 846 tag field in bit vector Lisp words is "lrecord" rather than "vector".)
 847
 848 \1f
 849 File: internals.info,  Node: Symbol,  Next: Marker,  Prev: Bit Vector,  Up: Allocation of Objects in XEmacs Lisp
 850
 851 Symbol
 852 ======
 853
 854    Symbols are also allocated in frob blocks.  Note that the code
 855 exists for symbols to be either lrecords (category (c) above) or simple
 856 types (category (b) above), and are lrecords by default (I think),
 857 although there is no good reason for this.
 858
 859    Note that symbols in the awful horrible obarray structure are
 860 chained through their `next' field.
 861
 862    Remember that `intern' looks up a symbol in an obarray, creating one
 863 if necessary.
 864
 865 \1f
 866 File: internals.info,  Node: Marker,  Next: String,  Prev: Symbol,  Up: Allocation of Objects in XEmacs Lisp
 867
 868 Marker
 869 ======
 870
 871    Markers are allocated in frob blocks, as usual.  They are kept in a
 872 buffer unordered, but in a doubly-linked list so that they can easily
 873 be removed. (Formerly this was a singly-linked list, but in some cases
 874 garbage collection took an extraordinarily long time due to the O(N^2)
 875 time required to remove lots of markers from a buffer.) Markers are
 876 removed from a buffer in the finalize stage, in
 877 `ADDITIONAL_FREE_marker()'.
 878
 879 \1f
 880 File: internals.info,  Node: String,  Next: Compiled Function,  Prev: Marker,  Up: Allocation of Objects in XEmacs Lisp
 881
 882 String
 883 ======
 884
 885    As mentioned above, strings are a special case.  A string is
 886 logically two parts, a fixed-size object (containing the length,
 887 property list, and a pointer to the actual data), and the actual data
 888 in the string.  The fixed-size object is a `struct Lisp_String' and is
 889 allocated in frob blocks, as usual.  The actual data is stored in
 890 special "string-chars blocks", which are 8K blocks of memory.
 891 Currently-allocated strings are simply laid end to end in these
 892 string-chars blocks, with a pointer back to the `struct Lisp_String'
 893 stored before each string in the string-chars block.  When a new string
 894 needs to be allocated, the remaining space at the end of the last
 895 string-chars block is used if there's enough, and a new string-chars
 896 block is created otherwise.
 897
 898    There are never any holes in the string-chars blocks due to the
 899 string compaction and relocation that happens at the end of garbage
 900 collection.  During the sweep stage of garbage collection, when objects
 901 are reclaimed, the garbage collector goes through all string-chars
 902 blocks, looking for unused strings.  Each chunk of string data is
 903 preceded by a pointer to the corresponding `struct Lisp_String', which
 904 indicates both whether the string is used and how big the string is,
 905 i.e. how to get to the next chunk of string data.  Holes are compressed
 906 by block-copying the next string into the empty space and relocating the
 907 pointer stored in the corresponding `struct Lisp_String'.  *This means
 908 you have to be careful with strings in your code.* See the section
 909 above on `GCPRO'ing.
 910
 911    Note that there is one situation not handled: a string that is too
 912 big to fit into a string-chars block.  Such strings, called "big
 913 strings", are all `malloc()'ed as their own block. (#### Although it
 914 would make more sense for the threshold for big strings to be somewhat
 915 lower, e.g. 1/2 or 1/4 the size of a string-chars block.  It seems that
 916 this was indeed the case formerly - indeed, the threshold was set at
 917 1/8 - but Mly forgot about this when rewriting things for 19.8.)
 918
 919    Note also that the string data in string-chars blocks is padded as
 920 necessary so that proper alignment constraints on the `struct
 921 Lisp_String' back pointers are maintained.
 922
 923    Finally, strings can be resized.  This happens in Mule when a
 924 character is substituted with a different-length character, or during
 925 modeline frobbing. (You could also export this to Lisp, but it's not
 926 done so currently.) Resizing a string is a potentially tricky process.
 927 If the change is small enough that the padding can absorb it, nothing
 928 other than a simple memory move needs to be done.  Keep in mind,
 929 however, that the string can't shrink too much because the offset to the
 930 next string in the string-chars block is computed by looking at the
 931 length and rounding to the nearest multiple of four or eight.  If the
 932 string would shrink or expand beyond the correct padding, new string
 933 data needs to be allocated at the end of the last string-chars block and
 934 the data moved appropriately.  This leaves some dead string data, which
 935 is marked by putting a special marker of 0xFFFFFFFF in the `struct
 936 Lisp_String' pointer before the data (there's no real `struct
 937 Lisp_String' to point to and relocate), and storing the size of the dead
 938 string data (which would normally be obtained from the now-non-existent
 939 `struct Lisp_String') at the beginning of the dead string data gap.
 940 The string compactor recognizes this special 0xFFFFFFFF marker and
 941 handles it correctly.
 942
 943 \1f
 944 File: internals.info,  Node: Compiled Function,  Prev: String,  Up: Allocation of Objects in XEmacs Lisp
 945
 946 Compiled Function
 947 =================
 948
 949    Not yet documented.
 950
 951 \1f
 952 File: internals.info,  Node: Events and the Event Loop,  Next: Evaluation; Stack Frames; Bindings,  Prev: Allocation of Objects in XEmacs Lisp,  Up: Top
 953
 954 Events and the Event Loop
 955 *************************
 956
 957 * Menu:
 958
 959 * Introduction to Events::
 960 * Main Loop::
 961 * Specifics of the Event Gathering Mechanism::
 962 * Specifics About the Emacs Event::
 963 * The Event Stream Callback Routines::
 964 * Other Event Loop Functions::
 965 * Converting Events::
 966 * Dispatching Events; The Command Builder::
 967
 968 \1f
 969 File: internals.info,  Node: Introduction to Events,  Next: Main Loop,  Up: Events and the Event Loop
 970
 971 Introduction to Events
 972 ======================
 973
 974    An event is an object that encapsulates information about an
 975 interesting occurrence in the operating system.  Events are generated
 976 either by user action, direct (e.g. typing on the keyboard or moving
 977 the mouse) or indirect (moving another window, thereby generating an
 978 expose event on an Emacs frame), or as a result of some other typically
 979 asynchronous action happening, such as output from a subprocess being
 980 ready or a timer expiring.  Events come into the system in an
 981 asynchronous fashion (typically through a callback being called) and
 982 are converted into a synchronous event queue (first-in, first-out) in a
 983 process that we will call "collection".
 984
 985    Note that each application has its own event queue. (It is
 986 immaterial whether the collection process directly puts the events in
 987 the proper application's queue, or puts them into a single system
 988 queue, which is later split up.)
 989
 990    The most basic level of event collection is done by the operating
 991 system or window system.  Typically, XEmacs does its own event
 992 collection as well.  Often there are multiple layers of collection in
 993 XEmacs, with events from various sources being collected into a queue,
 994 which is then combined with other sources to go into another queue
 995 (i.e. a second level of collection), with perhaps another level on top
 996 of this, etc.
 997
 998    XEmacs has its own types of events (called "Emacs events"), which
 999 provides an abstract layer on top of the system-dependent nature of the
1000 most basic events that are received.  Part of the complex nature of the
1001 XEmacs event collection process involves converting from the
1002 operating-system events into the proper Emacs events - there may not be
1003 a one-to-one correspondence.
1004
1005    Emacs events are documented in `events.h'; I'll discuss them later.
1006