info/internals.info-6

   1 This is Info file ../../info/internals.info, produced by Makeinfo
   2 version 1.68 from the input file internals.texi.
   3
   4 INFO-DIR-SECTION XEmacs Editor
   5 START-INFO-DIR-ENTRY
   6 * Internals: (internals).       XEmacs Internals Manual.
   7 END-INFO-DIR-ENTRY
   8
   9    Copyright (C) 1992 - 1996 Ben Wing.  Copyright (C) 1996, 1997 Sun
  10 Microsystems.  Copyright (C) 1994 - 1998 Free Software Foundation.
  11 Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
  12
  13    Permission is granted to make and distribute verbatim copies of this
  14 manual provided the copyright notice and this permission notice are
  15 preserved on all copies.
  16
  17    Permission is granted to copy and distribute modified versions of
  18 this manual under the conditions for verbatim copying, provided that the
  19 entire resulting derived work is distributed under the terms of a
  20 permission notice identical to this one.
  21
  22    Permission is granted to copy and distribute translations of this
  23 manual into another language, under the above conditions for modified
  24 versions, except that this permission notice may be stated in a
  25 translation approved by the Foundation.
  26
  27    Permission is granted to copy and distribute modified versions of
  28 this manual under the conditions for verbatim copying, provided also
  29 that the section entitled "GNU General Public License" is included
  30 exactly as in the original, and provided that the entire resulting
  31 derived work is distributed under the terms of a permission notice
  32 identical to this one.
  33
  34    Permission is granted to copy and distribute translations of this
  35 manual into another language, under the above conditions for modified
  36 versions, except that the section entitled "GNU General Public License"
  37 may be included in a translation approved by the Free Software
  38 Foundation instead of in the original English.
  39
  40 \1f
  41 File: internals.info,  Node: Main Loop,  Next: Specifics of the Event Gathering Mechanism,  Prev: Introduction to Events,  Up: Events and the Event Loop
  42
  43 Main Loop
  44 =========
  45
  46    The "command loop" is the top-level loop that the editor is always
  47 running.  It loops endlessly, calling `next-event' to retrieve an event
  48 and `dispatch-event' to execute it. `dispatch-event' does the
  49 appropriate thing with non-user events (process, timeout, magic, eval,
  50 mouse motion); this involves calling a Lisp handler function, redrawing
  51 a newly-exposed part of a frame, reading subprocess output, etc.  For
  52 user events, `dispatch-event' looks up the event in relevant keymaps or
  53 menubars; when a full key sequence or menubar selection is reached, the
  54 appropriate function is executed. `dispatch-event' may have to keep
  55 state across calls; this is done in the "command-builder" structure
  56 associated with each console (remember, there's usually only one
  57 console), and the engine that looks up keystrokes and constructs full
  58 key sequences is called the "command builder".  This is documented
  59 elsewhere.
  60
  61    The guts of the command loop are in `command_loop_1()'.  This
  62 function doesn't catch errors, though - that's the job of
  63 `command_loop_2()', which is a condition-case (i.e. error-trapping)
  64 wrapper around `command_loop_1()'.  `command_loop_1()' never returns,
  65 but may get thrown out of.
  66
  67    When an error occurs, `cmd_error()' is called, which usually invokes
  68 the Lisp error handler in `command-error'; however, a default error
  69 handler is provided if `command-error' is `nil' (e.g. during startup).
  70 The purpose of the error handler is simply to display the error message
  71 and do associated cleanup; it does not need to throw anywhere.  When
  72 the error handler finishes, the condition-case in `command_loop_2()'
  73 will finish and `command_loop_2()' will reinvoke `command_loop_1()'.
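
   The relationship between the two functions can be sketched in plain
C with `setjmp()' and `longjmp()'.  This is only an illustration, not
the XEmacs source: every `sketch_' name is hypothetical, events are
reduced to an enum, and the real code uses the Lisp condition-case
machinery rather than a bare `jmp_buf'.

```c
#include <setjmp.h>

/* Hypothetical event kinds; a real event carries far more state. */
enum sketch_event { SKETCH_OK, SKETCH_ERROR, SKETCH_EXIT };

static jmp_buf sketch_error_catch;  /* stands in for the condition-case */
static jmp_buf sketch_exit_catch;   /* stands in for a catch such as `exit' */
static int sketch_pos;              /* index of the next scripted event */
static int sketch_events_run;       /* events dispatched without error */
static int sketch_errors_handled;   /* times the error handler ran */

/* Inner loop: it never returns normally and can only be thrown out of. */
static void sketch_command_loop_1(const enum sketch_event *script)
{
  for (;;) {
    enum sketch_event ev = script[sketch_pos++];
    if (ev == SKETCH_ERROR)
      longjmp(sketch_error_catch, 1);
    if (ev == SKETCH_EXIT)
      longjmp(sketch_exit_catch, 1);
    sketch_events_run++;
  }
}

/* Error-trapping wrapper: after an error is handled, the inner loop is
   simply invoked again.  The script must end with SKETCH_EXIT. */
static void sketch_command_loop_2(const enum sketch_event *script)
{
  sketch_pos = 0;
  sketch_events_run = 0;
  sketch_errors_handled = 0;

  if (setjmp(sketch_exit_catch) != 0)
    return;                       /* thrown out of the whole loop */

  for (;;) {
    if (setjmp(sketch_error_catch) != 0) {
      sketch_errors_handled++;    /* the handler reports and cleans up */
      continue;                   /* then the inner loop is reinvoked */
    }
    sketch_command_loop_1(script);
  }
}
```

   The state lives in file-scope variables on purpose: locals modified
between `setjmp()' and `longjmp()' would have to be `volatile'.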
  74
  75    `command_loop_2()' is invoked from three places: from
  76 `initial_command_loop()' (called from `main()' at the end of internal
  77 initialization), from the Lisp function `recursive-edit', and from
  78 `call_command_loop()'.
  79
  80    `call_command_loop()' is called when a macro is started and when the
  81 minibuffer is entered; normal termination of the macro or minibuffer
  82 causes a throw out of the recursive command loop.  (The throw goes to
  83 `execute-kbd-macro' for macros and to `exit' for minibuffers.  Note also
  84 that the low-level minibuffer-entering function,
  85 `read-minibuffer-internal', provides its own error handling and does
  86 not need the error encapsulation of `command_loop_2()', so it tells
  87 `call_command_loop()' to invoke `command_loop_1()' directly.)
  88
  89    Note that both `read-minibuffer-internal' and `recursive-edit' set up a
  90 catch for `exit'; this is why `abort-recursive-edit', which throws to
  91 this catch, exits out of either one.
  92
  93    `initial_command_loop()', called from `main()', sets up a catch for
  94 `top-level' when invoking `command_loop_2()', allowing functions to
  95 throw all the way to the top level if they really need to.  Before
  96 invoking `command_loop_2()', `initial_command_loop()' calls
  97 `top_level_1()', which handles all of the startup stuff (creating the
  98 initial frame, handling the command-line options, loading the user's
  99 `.emacs' file, etc.).  The function that actually does this is in Lisp
 100 and is pointed to by the variable `top-level'; normally this function is
 101 `normal-top-level'.  `top_level_1()' is just an error-handling wrapper
 102 similar to `command_loop_2()'.  Note also that `initial_command_loop()'
 103 sets up a catch for `top-level' when invoking `top_level_1()', just
 104 like when it invokes `command_loop_2()'.
 105
 106 \1f
 107 File: internals.info,  Node: Specifics of the Event Gathering Mechanism,  Next: Specifics About the Emacs Event,  Prev: Main Loop,  Up: Events and the Event Loop
 108
 109 Specifics of the Event Gathering Mechanism
 110 ==========================================
 111
 112    Here is an approximate diagram of the collection processes at work
 113 in XEmacs under TTYs (TTYs are simpler than X, so we'll look at them
 114 first):
 115
 116       asynch.      asynch.    asynch.   asynch.             [Collectors in
 117      kbd events  kbd events   process   process                the OS]
 118            |         |         output    output
 119            |         |           |         |
 120            |         |           |         |      SIGINT,   [signal handlers
 121            |         |           |         |      SIGQUIT,     in XEmacs]
 122            V         V           V         V      SIGWINCH,
 123           file      file        file      file    SIGALRM
 124           desc.     desc.       desc.     desc.     |
 125           (TTY)     (TTY)       (pipe)    (pipe)    |
 126            |          |          |         |      fake    timeouts
 127            |          |          |         |      file        |
 128            |          |          |         |      desc.       |
 129            |          |          |         |      (pipe)      |
 130            |          |          |         |        |         |
 131            |          |          |         |        |         |
 132            |          |          |         |        |         |
 133            V          V          V         V        V         V
 134            ------>-----------<----------------<----------------
 135                        |
 136                        |
 137                        | [collected using select() in emacs_tty_next_event()
 138                        |  and converted to the appropriate Emacs event]
 139                        |
 140                        |
 141                        V          (above this line is TTY-specific)
 142                      Emacs -----------------------------------------------
 143                      event (below this line is the generic event mechanism)
 144                        |
 145                        |
 146      was there     if not, call
 147      a SIGINT?  emacs_tty_next_event()
 148          |             |
 149          |             |
 150          |             |
 151          V             V
 152          --->------<----
 153                 |
 154                 |     [collected in event_stream_next_event();
 155                 |      SIGINT is converted using maybe_read_quit_event()]
 156                 V
 157               Emacs
 158               event
 159                 |
 160                 \---->------>----- maybe_kbd_translate() ---->---\
 161                                                                  |
 162                                                                  |
 163                                                                  |
 164           command event queue                                    |
 165                                                     if not from command
 166        (contains events that were                   event queue, call
 167        read earlier but not processed,              event_stream_next_event()
 168        typically when waiting in a                               |
 169        sit-for, sleep-for, etc. for                              |
 170       a particular event to be received)                         |
 171                     |                                            |
 172                     |                                            |
 173                     V                                            V
 174                     ---->------------------------------------<----
 175                                                     |
 176                                                     | [collected in
 177                                                     |  next_event_internal()]
 178                                                     |
 179       unread-     unread-       event from          |
 180       command-    command-       keyboard       else, call
 181       events      event           macro      next_event_internal()
 182         |           |               |               |
 183         |           |               |               |
 184         |           |               |               |
 185         V           V               V               V
 186         --------->----------------------<------------
 187                           |
 188                           |      [collected in `next-event', which may loop
 189                           |       more than once if the event it gets is on
 190                           |       a dead frame, device, etc.]
 191                           |
 192                           |
 193                           V
 194                  feed into top-level event loop,
 195                  which repeatedly calls `next-event'
 196                  and then dispatches the event
 197                  using `dispatch-event'
 198
 199    Notice the separation between TTY-specific and generic event
 200 mechanism.  When using the Xt-based event loop, the TTY-specific stuff
 201 is replaced but the rest stays the same.
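
   The generic layers in the diagram above can be sketched as three
nested functions, each trying its own sources before falling through to
the layer below.  This is a toy model under stated assumptions: every
`sketch_' name and the `FROM_' constants are hypothetical, each source
is reduced to a counter, and the relative order of the three `next-event'
sources is read off the left-to-right layout of the diagram rather than
taken from the source code.

```c
/* Hypothetical model: each source is a count of pending events. */
struct sketch_sources {
  int unread_command_events;
  int unread_command_event;
  int keyboard_macro;
  int command_event_queue;
  int sigint_pending;
};

enum sketch_origin {
  FROM_UNREAD_COMMAND_EVENTS,
  FROM_UNREAD_COMMAND_EVENT,
  FROM_KEYBOARD_MACRO,
  FROM_COMMAND_EVENT_QUEUE,
  FROM_SIGINT,
  FROM_SYSTEM_EVENT_LOOP
};

/* Consume one event from a counted source; return 0 if it is empty. */
static int sketch_take(int *count)
{
  if (*count <= 0)
    return 0;
  (*count)--;
  return 1;
}

/* Lowest generic layer: a pending SIGINT wins, otherwise the
   system-specific loop (TTY or Xt) is asked for an event. */
static enum sketch_origin
sketch_event_stream_next_event(struct sketch_sources *s)
{
  if (sketch_take(&s->sigint_pending))
    return FROM_SIGINT;
  return FROM_SYSTEM_EVENT_LOOP;
}

/* Middle layer: the command event queue is drained first. */
static enum sketch_origin
sketch_next_event_internal(struct sketch_sources *s)
{
  if (sketch_take(&s->command_event_queue))
    return FROM_COMMAND_EVENT_QUEUE;
  return sketch_event_stream_next_event(s);
}

/* Top layer, corresponding to `next-event'. */
static enum sketch_origin sketch_next_event(struct sketch_sources *s)
{
  if (sketch_take(&s->unread_command_events))
    return FROM_UNREAD_COMMAND_EVENTS;
  if (sketch_take(&s->unread_command_event))
    return FROM_UNREAD_COMMAND_EVENT;
  if (sketch_take(&s->keyboard_macro))
    return FROM_KEYBOARD_MACRO;
  return sketch_next_event_internal(s);
}
```

   Only `sketch_event_stream_next_event()' would change when the
system-specific loop is swapped; the two layers above it stay the same.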
 202
 203    It's also important to realize that only one kind of
 204 system-specific event loop can be operating at a time, and it must be
 205 able to receive all kinds of events simultaneously.  For the two existing
 206 event loops (implemented in `event-tty.c' and `event-Xt.c',
 207 respectively), the TTY event loop *only* handles TTY consoles, while
 208 the Xt event loop handles *both* TTY and X consoles.  This situation is
 209 different from all of the output handlers, where you simply have one
 210 per console type.
 211
 212    Here's the Xt Event Loop Diagram (notice that below a certain point,
 213 it's the same as the above diagram):
 214
 215      asynch. asynch. asynch. asynch.                 [Collectors in
 216       kbd     kbd    process process                    the OS]
 217      events  events  output  output
 218        |       |       |       |
 219        |       |       |       |     asynch. asynch. [Collectors in the
 220        |       |       |       |       X        X     OS and X Window System]
 221        |       |       |       |     events  events
 222        |       |       |       |       |        |
 223        |       |       |       |       |        |
 224        |       |       |       |       |        |    SIGINT, [signal handlers
 225        |       |       |       |       |        |    SIGQUIT,   in XEmacs]
 226        |       |       |       |       |        |    SIGWINCH,
 227        |       |       |       |       |        |    SIGALRM
 228        |       |       |       |       |        |       |
 229        |       |       |       |       |        |       |
 230        |       |       |       |       |        |       |      timeouts
 231        |       |       |       |       |        |       |          |
 232        |       |       |       |       |        |       |          |
 233        |       |       |       |       |        |       V          |
 234        V       V       V       V       V        V      fake        |
 235       file    file    file    file    file     file    file        |
 236       desc.   desc.   desc.   desc.   desc.    desc.   desc.       |
 237       (TTY)   (TTY)   (pipe)  (pipe) (socket) (socket) (pipe)      |
 238        |       |       |       |       |        |       |          |
 239        |       |       |       |       |        |       |          |
 240        |       |       |       |       |        |       |          |
 241        V       V       V       V       V        V       V          V
 242        --->----------------------------------------<---------<------
 243             |              |               |
 244             |              |               |[collected using select() in
 245             |              |               | _XtWaitForSomething(), called
 246             |              |               | from XtAppProcessEvent(), called
 247             |              |               | in emacs_Xt_next_event();
 248             |              |               | dispatched to various callbacks]
 249             |              |               |
 250             |              |               |
 251        emacs_Xt_        p_s_callback(),    | [popup_selection_callback]
 252        event_handler()  x_u_v_s_callback(),| [x_update_vertical_scrollbar_
 253             |           x_u_h_s_callback(),|  callback]
 254             |           search_callback()  | [x_update_horizontal_scrollbar_
 255             |              |               |  callback]
 256             |              |               |
 257             |              |               |
 258        enqueue_Xt_       signal_special_   |
 259        dispatch_event()  Xt_user_event()   |
 260        [maybe multiple     |               |
 261         times, maybe 0     |               |
 262         times]             |               |
 263             |            enqueue_Xt_       |
 264             |            dispatch_event()  |
 265             |              |               |
 266             |              |               |
 267             V              V               |
 268             -->----------<--               |
 269                    |                       |
 270                    |                       |
 271                 dispatch             Xt_what_callback()
 272                 event                  sets flags
 273                 queue                      |
 274                    |                       |
 275                    |                       |
 276                    |                       |
 277                    |                       |
 278                    ---->-----------<--------
 279                         |
 280                         |
 281                         |     [collected and converted as appropriate in
 282                         |            emacs_Xt_next_event()]
 283                         |
 284                         |
 285                         V          (above this line is Xt-specific)
 286                       Emacs ------------------------------------------------
 287                       event (below this line is the generic event mechanism)
 288                         |
 289                         |
 290      was there      if not, call
 291      a SIGINT?   emacs_Xt_next_event()
 292          |              |
 293          |              |
 294          |              |
 295          V              V
 296          --->-------<----
 297                 |
 298                 |        [collected in event_stream_next_event();
 299                 |         SIGINT is converted using maybe_read_quit_event()]
 300                 V
 301               Emacs
 302               event
 303                 |
 304                 \---->------>----- maybe_kbd_translate() -->-----\
 305                                                                  |
 306                                                                  |
 307                                                                  |
 308           command event queue                                    |
 309                                                    if not from command
 310        (contains events that were                  event queue, call
 311        read earlier but not processed,             event_stream_next_event()
 312        typically when waiting in a                               |
 313        sit-for, sleep-for, etc. for                              |
 314       a particular event to be received)                         |
 315                     |                                            |
 316                     |                                            |
 317                     V                                            V
 318                     ---->----------------------------------<------
 319                                                     |
 320                                                     | [collected in
 321                                                     |  next_event_internal()]
 322                                                     |
 323       unread-     unread-       event from          |
 324       command-    command-       keyboard       else, call
 325       events      event           macro      next_event_internal()
 326         |           |               |               |
 327         |           |               |               |
 328         |           |               |               |
 329         V           V               V               V
 330         --------->----------------------<------------
 331                           |
 332                           |      [collected in `next-event', which may loop
 333                           |       more than once if the event it gets is on
 334                           |       a dead frame, device, etc.]
 335                           |
 336                           |
 337                           V
 338                  feed into top-level event loop,
 339                  which repeatedly calls `next-event'
 340                  and then dispatches the event
 341                  using `dispatch-event'
 342
 343 \1f
 344 File: internals.info,  Node: Specifics About the Emacs Event,  Next: The Event Stream Callback Routines,  Prev: Specifics of the Event Gathering Mechanism,  Up: Events and the Event Loop
 345
 346 Specifics About the Emacs Event
 347 ===============================
 348
 349 \1f
 350 File: internals.info,  Node: The Event Stream Callback Routines,  Next: Other Event Loop Functions,  Prev: Specifics About the Emacs Event,  Up: Events and the Event Loop
 351
 352 The Event Stream Callback Routines
 353 ==================================
 354
 355 \1f
 356 File: internals.info,  Node: Other Event Loop Functions,  Next: Converting Events,  Prev: The Event Stream Callback Routines,  Up: Events and the Event Loop
 357
 358 Other Event Loop Functions
 359 ==========================
 360
 361    `detect_input_pending()' and `input-pending-p' look for input by
 362 calling `event_stream->event_pending_p' and looking in
 363 `[V]unread-command-event' and the `command_event_queue' (they do not
 364 check for an executing keyboard macro, though).
 365
 366    `discard-input' cancels any pending command events (and any keyboard
 367 macros currently executing), and puts the remaining, non-command events
 368 onto the `command_event_queue'.  The code has a comment about a "race
 369 condition", which is not a good sign.
 370
 371    `next-command-event' and `read-char' are higher-level interfaces to
 372 `next-event'.  `next-command-event' gets the next "command" event (i.e.
 373 keypress, mouse event, menu selection, or scrollbar action), calling
 374 `dispatch-event' on any others.  `read-char' calls `next-command-event'
 375 and uses `event_to_character()' to return the character equivalent.
 376 With the right kind of input method support, it is possible for
 377 `(read-char)' to return a Kanji character.
 378
 379 \1f
 380 File: internals.info,  Node: Converting Events,  Next: Dispatching Events; The Command Builder,  Prev: Other Event Loop Functions,  Up: Events and the Event Loop
 381
 382 Converting Events
 383 =================
 384
 385    `character_to_event()', `event_to_character()',
 386 `event-to-character', and `character-to-event' convert between
 387 characters and keypress events corresponding to the characters.  If the
 388 event was not a keypress, `event_to_character()' returns -1 and
 389 `event-to-character' returns `nil'.  These functions convert between
 390 character representation and the split-up event representation (keysym
 391 plus mod keys).
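
   As a rough illustration of the "split-up" representation, the sketch
below converts between an ASCII character and a keysym plus a modifier
mask.  Everything here is an assumption made for the example: the
`sketch_' names, the single control modifier bit, and the rule that
control characters map to a lower-case letter keysym are not taken from
the XEmacs source, which also has to deal with meta and with non-ASCII
characters.

```c
/* Hypothetical event model with just enough fields for the example. */
enum sketch_event_type { SKETCH_KEY_PRESS, SKETCH_BUTTON_PRESS };

#define SKETCH_MOD_CONTROL 1

struct sketch_event {
  enum sketch_event_type type;
  int keysym;       /* the base key, e.g. 'a' */
  int modifiers;    /* bit mask of SKETCH_MOD_* */
};

/* Character -> keypress event.  Codes 0..31 become control + base key. */
static struct sketch_event sketch_character_to_event(int c)
{
  struct sketch_event ev = { SKETCH_KEY_PRESS, c, 0 };
  if (c >= 0 && c < 32) {
    ev.modifiers |= SKETCH_MOD_CONTROL;
    ev.keysym = c + '@';                    /* 1 -> 'A', 27 -> '[' */
    if (ev.keysym >= 'A' && ev.keysym <= 'Z')
      ev.keysym += 'a' - 'A';               /* prefer the lower-case key */
  }
  return ev;
}

/* Keypress event -> character, or -1 if there is no character
   equivalent (including any event that is not a keypress). */
static int sketch_event_to_character(const struct sketch_event *ev)
{
  int c;
  if (ev->type != SKETCH_KEY_PRESS)
    return -1;
  c = ev->keysym;
  if (ev->modifiers & SKETCH_MOD_CONTROL) {
    if (c >= 'a' && c <= 'z')
      c -= 'a' - 'A';
    if (c < '@' || c > '_')
      return -1;                            /* no control equivalent */
    c -= '@';
  }
  return c;
}
```

   Under these assumptions the two functions are inverses for ASCII
input, and a button press has no character equivalent.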
 392
 393 \1f
 394 File: internals.info,  Node: Dispatching Events; The Command Builder,  Prev: Converting Events,  Up: Events and the Event Loop
 395
 396 Dispatching Events; The Command Builder
 397 =======================================
 398
 399    Not yet documented.
 400
 401 \1f
 402 File: internals.info,  Node: Evaluation; Stack Frames; Bindings,  Next: Symbols and Variables,  Prev: Events and the Event Loop,  Up: Top
 403
 404 Evaluation; Stack Frames; Bindings
 405 **********************************
 406
 407 * Menu:
 408
 409 * Evaluation::
 410 * Dynamic Binding; The specbinding Stack; Unwind-Protects::
 411 * Simple Special Forms::
 412 * Catch and Throw::
 413
 414 \1f
 415 File: internals.info,  Node: Evaluation,  Next: Dynamic Binding; The specbinding Stack; Unwind-Protects,  Up: Evaluation; Stack Frames; Bindings
 416
 417 Evaluation
 418 ==========
 419
 420    `Feval()' evaluates the form (a Lisp object) that is passed to it.
 421 Note that evaluation is only non-trivial for two types of objects:
 422 symbols and conses.  A symbol is evaluated simply by calling
 423 `symbol-value' on it and returning the value.
 424
 425    Evaluating a cons means calling a function.  First, `eval' checks to
 426 see if garbage-collection is necessary, and calls `garbage_collect_1()'
 427 if so.  It then increases the evaluation depth by 1 (`lisp_eval_depth',
 428 which is always less than `max_lisp_eval_depth') and adds an element to
 429 the linked list of `struct backtrace''s (`backtrace_list').  Each such
 430 structure contains a pointer to the function being called plus a list
 431 of the function's arguments.  Originally these values are stored
 432 unevalled, and as they are evaluated, the backtrace structure is
 433 updated.  Garbage collection pays attention to the objects pointed to
 434 in the backtrace structures (garbage collection might happen while a
 435 function is being called or while an argument is being evaluated, and
 436 there could easily be no other references to the arguments in the
 437 argument list; once an argument is evaluated, however, the unevalled
 438 version is not needed by eval, and so the backtrace structure is
 439 changed).
 440
 441    At this point, the function to be called is determined by looking at
 442 the car of the cons (if this is a symbol, its function definition is
 443 retrieved and the process repeated).  The function should then consist
 444 of either a `Lisp_Subr' (built-in function written in C), a
 445 `Lisp_Compiled_Function' object, or a cons whose car is one of the
 446 symbols `autoload', `macro' or `lambda'.
 447
 448    If the function is a `Lisp_Subr', the Lisp object points to a
 449 `struct Lisp_Subr' (created by `DEFUN()'), which contains a pointer to
 450 the C function, a minimum and maximum number of arguments (or possibly
 451 the special constants `MANY' or `UNEVALLED'), a pointer to the symbol
 452 referring to that subr, and a couple of other things.  If the subr
 453 wants its arguments `UNEVALLED', they are passed raw as a list.
 454 Otherwise, an array of evaluated arguments is created and put into the
 455 backtrace structure; the array is then either passed whole (`MANY') or
 456 unpacked so that each argument is passed as a separate C argument.
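
   The three ways of handing arguments to a subr can be summarized in a
small decision function.  This is a sketch only: the `sketch_' names,
the numeric values chosen for the two special constants, and the exact
order of the argument-count checks are assumptions, not a copy of what
`Feval()' does.

```c
/* Hypothetical encodings of the two special "maximum" values. */
#define SKETCH_UNEVALLED (-1)
#define SKETCH_MANY      (-2)

enum sketch_call_style {
  SKETCH_PASS_RAW_LIST,     /* UNEVALLED: unevaluated argument list */
  SKETCH_PASS_ARRAY,        /* MANY: count plus array of values */
  SKETCH_PASS_SEPARATE,     /* fixed arity: one C argument each */
  SKETCH_WRONG_NUMBER       /* argument count out of range */
};

struct sketch_subr {
  const char *name;
  int min_args;
  int max_args;             /* or SKETCH_UNEVALLED / SKETCH_MANY */
};

/* Decide how NARGS actual arguments would be handed to SUBR. */
static enum sketch_call_style
sketch_call_style_for(const struct sketch_subr *subr, int nargs)
{
  if (nargs < subr->min_args)
    return SKETCH_WRONG_NUMBER;
  if (subr->max_args == SKETCH_UNEVALLED)
    return SKETCH_PASS_RAW_LIST;
  if (subr->max_args == SKETCH_MANY)
    return SKETCH_PASS_ARRAY;
  if (nargs > subr->max_args)
    return SKETCH_WRONG_NUMBER;
  return SKETCH_PASS_SEPARATE;
}
```

   For example, a special form such as `if' would be described with
`SKETCH_UNEVALLED', while a variadic function such as `list' would use
`SKETCH_MANY'.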
 457
 458    If the function is a `Lisp_Compiled_Function',
 459 `funcall_compiled_function()' is called.  If the function is a lambda
 460 list, `funcall_lambda()' is called.  If the function is a macro, [.....
 461 fill in] is done.  If the function is an autoload, `do_autoload()' is
 462 called to load the definition and then eval starts over [explain this
 463 more].
 464
 465    When `Feval()' exits, the evaluation depth is reduced by one, the
 466 debugger is called if appropriate, and the current backtrace structure
 467 is removed from the list.
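
   The bookkeeping done on entry to and exit from `Feval()' can be
pictured with the following sketch.  It is a simplified model, not the
real function: the `sketch_' names are hypothetical, the backtrace entry
holds only a function name, "evaluation" is just a nested call, and the
depth check is written so that the depth stays strictly below the limit,
as the text above states.

```c
#include <stddef.h>

/* Hypothetical, much reduced version of `struct backtrace'. */
struct sketch_backtrace {
  struct sketch_backtrace *next;
  const char *function;
};

static struct sketch_backtrace *sketch_backtrace_list = NULL;
static int sketch_lisp_eval_depth = 0;
static int sketch_max_lisp_eval_depth = 200;   /* arbitrary demo limit */

/* "Evaluate" a form that nests LEVELS further calls.  Returns the
   deepest evaluation depth reached, or -1 if the limit was hit. */
static int sketch_eval_nested(int levels)
{
  struct sketch_backtrace bt;
  int result;

  if (sketch_lisp_eval_depth + 1 >= sketch_max_lisp_eval_depth)
    return -1;                       /* the real code signals an error */

  /* Entry: bump the depth and link a new backtrace entry. */
  sketch_lisp_eval_depth++;
  bt.function = "sketch";
  bt.next = sketch_backtrace_list;
  sketch_backtrace_list = &bt;

  if (levels > 0)
    result = sketch_eval_nested(levels - 1);
  else
    result = sketch_lisp_eval_depth;

  /* Exit: unlink the entry and restore the depth. */
  sketch_backtrace_list = bt.next;
  sketch_lisp_eval_depth--;
  return result;
}
```

   Because each entry lives in the C stack frame of the call that
created it, the list always mirrors the chain of evaluations in progress.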
 468
 469    Both `funcall_compiled_function()' and `funcall_lambda()' need to go
 470 through the list of formal parameters to the function and bind them to
 471 the actual arguments, checking for `&rest' and `&optional' symbols in
 472 the formal parameters and making sure the number of actual arguments is
 473 correct.  `funcall_compiled_function()' can do this a little more
 474 efficiently, since the formal parameter list can be checked for sanity
 475 when the compiled function object is created.
 476
 477    `funcall_lambda()' simply calls `Fprogn' to execute the code in the
 478 lambda list.
 479
 480    `funcall_compiled_function()' calls the real byte-code interpreter
 481 `execute_optimized_program()' on the byte-code instructions, which are
 482 converted into an internal form for faster execution.
 483
 484    When a compiled function is executed for the first time by
 485 `funcall_compiled_function()', or when it is `Fpurecopy()'ed during the
 486 dump phase of building XEmacs, the byte-code instructions are converted
 487 from a `Lisp_String' (which is inefficient to access, especially in the
 488 presence of MULE) into a `Lisp_Opaque' object containing an array of
 489 unsigned char, which can be directly executed by the byte-code
 490 interpreter.  At this time the byte code is also analyzed for validity
 491 and transformed into a more optimized form, so that
 492 `execute_optimized_program()' can really fly.
 493
 494    Here are some of the optimizations performed by the internal
 495 byte-code transformer:
 496   1. References to the `constants' array are checked for out-of-range
 497      indices, so that the byte interpreter doesn't have to.
 498
 499   2. References to the `constants' array that will be used as a Lisp
 500      variable are checked for being correct non-constant (i.e. not `t',
 501      `nil', or `keywordp') symbols, so that the byte interpreter
 502      doesn't have to.
 503
 504   3. The maximum number of variable bindings in the byte-code is
 505      pre-computed, so that space on the `specpdl' stack can be
 506      pre-reserved once for the whole function execution.
 507
 508   4. All byte-code jumps are relative to the current program counter
 509      instead of the start of the program, thereby saving a register.
 510
 511   5. One-byte relative jumps are converted from the byte-code form of
 512      unsigned chars offset by 127 to machine-friendly signed chars.
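
   The last item in the list above amounts to a one-time arithmetic
conversion.  The sketch below shows the idea only; `sketch_convert_jump()'
is a hypothetical name, and how the real transformer treats an offset
that does not fit in a signed char is not stated here, so the sketch
simply reports failure in that case.

```c
#include <limits.h>

/* Convert a one-byte relative jump from its byte-code form (an
   unsigned char biased by 127) to a signed offset.  Returns 0 and
   stores the offset in *OUT on success, or -1 if the offset does not
   fit in a signed char. */
static int sketch_convert_jump(unsigned char raw, signed char *out)
{
  int offset = (int) raw - 127;

  if (offset < SCHAR_MIN || offset > SCHAR_MAX)
    return -1;
  *out = (signed char) offset;
  return 0;
}
```

   Doing the subtraction once, when the function is first prepared,
means the interpreter loop can add the stored offset to the program
counter without any further adjustment.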
 513
 514    Of course, this transformation of the `instructions' should not be
 515 visible to the user, so `Fcompiled_function_instructions()' needs to
 516 know how to convert the optimized opaque object back into a Lisp string
 517 that is identical to the original string from the `.elc' file.
 518 (Actually, the resulting string may (rarely) contain slightly
 519 different, yet equivalent, byte code.)
 520
 521    `Ffuncall()' implements Lisp `funcall'.  `(funcall fun x1 x2 x3
 522 ...)' is equivalent to `(eval (list fun (quote x1) (quote x2) (quote
 523 x3) ...))'.  `Ffuncall()' contains its own code to do the evaluation,
 524 however, and is very similar to `Feval()'.
 525
 526    From the performance point of view, it is worth knowing that most of
 527 the time in Lisp evaluation is spent executing `Lisp_Subr' and
 528 `Lisp_Compiled_Function' objects via `Ffuncall()' (not `Feval()').
 529
 530    `Fapply()' implements Lisp `apply', which is very similar to
 531 `funcall' except that if the last argument is a list, the result is the
 532 same as if each of the arguments in the list had been passed separately.
 533 `Fapply()' does some business to expand the last argument if it's a
 534 list, then calls `Ffuncall()' to do the work.
 535
 536    `apply1()', `call0()', `call1()', `call2()', and `call3()' call a
 537 function, passing it the argument(s) given (the arguments are given as
 538 separate C arguments rather than being passed as an array).  `apply1()'
 539 uses `Fapply()' while the others use `Ffuncall()' to do the real work.
 540
 541 \1f
 542 File: internals.info,  Node: Dynamic Binding; The specbinding Stack; Unwind-Protects,  Next: Simple Special Forms,  Prev: Evaluation,  Up: Evaluation; Stack Frames; Bindings
 543
 544 Dynamic Binding; The specbinding Stack; Unwind-Protects
 545 =======================================================
 546
 547      struct specbinding
 548      {
 549        Lisp_Object symbol;
 550        Lisp_Object old_value;
 551        Lisp_Object (*func) (Lisp_Object); /* for unwind-protect */
 552      };
 553
 554    `struct specbinding' is used for local-variable bindings and
 555 unwind-protects.  `specpdl' holds an array of `struct specbinding''s,
 556 `specpdl_ptr' points to the beginning of the free bindings in the
 557 array, `specpdl_size' specifies the total number of binding slots in
 558 the array, and `max_specpdl_size' specifies the maximum number of
 559 bindings the array can be expanded to hold.  `grow_specpdl()' increases
 560 the size of the `specpdl' array, multiplying its size by 2 but never
 561 exceeding `max_specpdl_size' (except that if this number is less than
 562 400, it is first set to 400).
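
   The growth rule just described can be written down as a small
function.  This is a sketch of the rule as stated in the paragraph
above, with a hypothetical name and an error return in place of the
Lisp error that the real `grow_specpdl()' would signal; it computes the
new size only and does not reallocate anything.

```c
/* Compute the new size of a binding stack that currently has SIZE
   slots.  *MAX_SIZE is raised to 400 first if it is smaller and the
   stack is already full.  Returns the new size, or -1 if the stack
   cannot grow any further. */
static int sketch_grown_specpdl_size(int size, int *max_size)
{
  if (size >= *max_size) {
    if (*max_size < 400)
      *max_size = 400;
    if (size >= *max_size)
      return -1;                /* binding depth limit exceeded */
  }
  size *= 2;                    /* double ... */
  if (size > *max_size)
    size = *max_size;           /* ... but never exceed the maximum */
  return size;
}
```

   Doubling keeps the number of reallocations logarithmic in the final
stack depth, while the cap turns runaway recursion into an error.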
 563
 564    `specbind()' binds a symbol to a value and is used for local
 565 variables and `let' forms.  The symbol and its old value (which might
 566 be `Qunbound', indicating no prior value) are recorded in the specpdl
 567 array, and `specpdl_ptr' is advanced by 1.
 568
 569    `record_unwind_protect()' implements an "unwind-protect", which,
 570 when placed around a section of code, ensures that some specified
 571 cleanup routine will be executed even if the code exits abnormally
 572 (e.g. through a `throw' or quit).  `record_unwind_protect()' simply
 573 adds a new specbinding to the `specpdl' array and stores the
 574 appropriate information in it.  The cleanup routine can either be a C
 575 function, which is stored in the `func' field, or a `progn' form, which
 576 is stored in the `old_value' field.
 577
 578    `unbind_to()' removes specbindings from the `specpdl' array until
 579 the specified position is reached.  Each specbinding can be one of
 580 three types:
 581
 582   1. an unwind-protect with a C cleanup function (`func' is not 0, and
 583      `old_value' holds an argument to be passed to the function);
 584
 585   2. an unwind-protect with a Lisp form (`func' is 0, `symbol' is
 586      `nil', and `old_value' holds the form to be executed with
 587      `Fprogn()'); or
 588
 589   3. a local-variable binding (`func' is 0, `symbol' is not `nil', and
 590      `old_value' holds the old value, which is stored as the symbol's
 591      value).
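
   The interplay of binding, unwind-protect, and unwinding can be
sketched in a few lines of C.  This is a simplified model, not the
XEmacs code: symbols are plain structs with a `long' value, the stack
has a fixed size, and only cases 1 and 3 above are modelled (the Lisp
form case would need an evaluator).  Every `toy_' name is an assumption
made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy symbol: just a name and a value slot. */
typedef struct toy_sym { const char *name; long value; } Toy_Sym;

struct toy_binding {
  Toy_Sym *symbol;          /* NULL for an unwind-protect entry */
  long old_value;           /* saved value, or argument for func */
  void (*func)(long);       /* non-NULL: C cleanup function */
};

#define TOY_SLOTS 64
static struct toy_binding toy_stack[TOY_SLOTS];
static struct toy_binding *toy_ptr = toy_stack;   /* first free slot */

static int toy_depth(void) { return (int)(toy_ptr - toy_stack); }

/* Record the old value, then install the new one. */
static void toy_specbind(Toy_Sym *sym, long value)
{
  assert(toy_depth() < TOY_SLOTS);
  toy_ptr->symbol = sym;
  toy_ptr->old_value = sym->value;
  toy_ptr->func = NULL;
  toy_ptr++;
  sym->value = value;
}

/* Push a cleanup function and the argument it will receive. */
static void toy_record_unwind_protect(void (*func)(long), long arg)
{
  assert(toy_depth() < TOY_SLOTS);
  toy_ptr->symbol = NULL;
  toy_ptr->old_value = arg;
  toy_ptr->func = func;
  toy_ptr++;
}

/* Pop entries, newest first, until the stack is back at depth `count'. */
static void toy_unbind_to(int count)
{
  while (toy_depth() > count) {
    --toy_ptr;
    if (toy_ptr->func != NULL)
      toy_ptr->func(toy_ptr->old_value);            /* case 1 */
    else if (toy_ptr->symbol != NULL)
      toy_ptr->symbol->value = toy_ptr->old_value;  /* case 3 */
  }
}

/* Example cleanup routine: accumulates its arguments. */
static long toy_cleanup_total;
static void toy_note_cleanup(long arg) { toy_cleanup_total += arg; }
```

   A caller remembers `toy_depth()' before binding anything and passes
that number to `toy_unbind_to()' on the way out, which restores nested
bindings of the same symbol in the right order.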
 592
 593 \1f
 594 File: internals.info,  Node: Simple Special Forms,  Next: Catch and Throw,  Prev: Dynamic Binding; The specbinding Stack; Unwind-Protects,  Up: Evaluation; Stack Frames; Bindings
 595
 596 Simple Special Forms
 597 ====================
 598
 599    `or', `and', `if', `cond', `progn', `prog1', `prog2', `setq',
 600 `quote', `function', `let*', `let', `while'
 601
 602    All of these are very simple and work as expected, calling `Feval()'
 603 or `Fprogn()' as necessary and (in the case of `let' and `let*') using
 604 `specbind()' to create bindings and `unbind_to()' to undo the bindings
 605 when finished.
 606
 607    Note that, with the exception of `Fprogn', these functions are
 608 typically called in real life only in interpreted code, since the byte
 609 compiler knows how to convert calls to these functions directly into
 610 byte code.
 611
 612 \1f
 613 File: internals.info,  Node: Catch and Throw,  Prev: Simple Special Forms,  Up: Evaluation; Stack Frames; Bindings
 614
 615 Catch and Throw
 616 ===============
 617
 618      struct catchtag
 619      {
 620        Lisp_Object tag;
 621        Lisp_Object val;
 622        struct catchtag *next;
 623        struct gcpro *gcpro;
 624        jmp_buf jmp;
 625        struct backtrace *backlist;
 626        int lisp_eval_depth;
 627        int pdlcount;
 628      };
 629
 630    `catch' is a Lisp function that places a catch around a body of
 631 code.  A catch is a means of non-local exit from the code.  When a catch
 632 is created, a tag is specified, and executing a `throw' to this tag
 633 will exit from the body of code caught with this tag; the value of the
 634 `catch' is then the value given in the call to `throw'.  If there is no
 635 such call, the body is executed normally and its value is returned.
 636
 637    Information pertaining to a catch is held in a `struct catchtag',
 638 which is placed at the head of a linked list pointed to by `catchlist'.
 639 `internal_catch()' is passed a C function to call (`Fprogn()' when
 640 Lisp `catch' is called) and arguments to give it, and places a catch
 641 around the function.  Each `struct catchtag' is held in the stack frame
 642 of the `internal_catch()' instance that created the catch.
 643
 644    `internal_catch()' is fairly straightforward.  It stores into the
 645 `struct catchtag' the tag name and the current values of
 646 `backtrace_list', `lisp_eval_depth', `gcprolist', and the offset into
 647 the `specpdl' array, sets a jump point with `_setjmp()' (storing the
 648 jump point into the `struct catchtag'), and calls the function.
 649 Control will return to `internal_catch()' either when the function
 650 exits normally or through a `_longjmp()' to this jump point.  In the
 651 latter case, `throw' will store the value to be returned into the
 652 `struct catchtag' before jumping.  When it's done, `internal_catch()'
 653 removes the `struct catchtag' from the catchlist and returns the proper
 654 value.
 655
 656    `Fthrow()' goes up through the catchlist until it finds one with a
 657 matching tag.  It then calls `unbind_catch()' to restore everything to
 658 what it was when the appropriate catch was set, stores the return value
 659 in the `struct catchtag', and jumps (with `_longjmp()') to its jump
 660 point.
 661
 662    `unbind_catch()' removes all catches from the catchlist until it
 663 finds the correct one.  Some of the catches might have been placed for
 664 error-trapping, and if so, the appropriate entries on the handlerlist
 665 must be removed (see "errors").  `unbind_catch()' also restores the
 666 values of `gcprolist', `backtrace_list', and `lisp_eval_depth', and calls
 667 `unbind_to()' to undo any specbindings created since the catch.
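
   The core of this mechanism can be sketched with the standard C
`setjmp()' and `longjmp()' (the text above mentions the `_setjmp()' and
`_longjmp()' variants).  The sketch below is only a model under several
assumptions: tags and values are plain integers, and the saved GC,
backtrace, and specpdl state is left out.  All `toy_' names are
hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct toy_catch {
  int tag;
  volatile int val;          /* volatile: written between setjmp and longjmp */
  struct toy_catch *next;    /* next outer catch */
  jmp_buf jmp;
};

static struct toy_catch *toy_catchlist;   /* innermost catch first */

/* Walk outward to the first catch with a matching tag, drop every
   catch inside it, store the value, and jump.  Returns only if no
   catch matches (real code would signal an error instead). */
static void toy_throw(int tag, int value)
{
  struct toy_catch *c;
  for (c = toy_catchlist; c != NULL; c = c->next)
    if (c->tag == tag) {
      toy_catchlist = c;
      c->val = value;
      longjmp(c->jmp, 1);
    }
}

/* Call func(arg) with a catch for `tag' around it.  The catch record
   lives in this function's own stack frame. */
static int toy_internal_catch(int tag, int (*func)(int), int arg)
{
  struct toy_catch c;
  c.tag = tag;
  c.val = 0;
  c.next = toy_catchlist;
  toy_catchlist = &c;

  if (setjmp(c.jmp) == 0)
    c.val = func(arg);       /* normal return from the body */
  /* otherwise toy_throw() already stored the value in c.val */

  toy_catchlist = c.next;    /* remove this catch from the list */
  return c.val;
}

/* Example bodies. */
static int toy_body_returns(int arg) { return arg + 1; }
static int toy_body_throws(int arg) { toy_throw(1, arg * 10); return -1; }
static int toy_body_nested(int arg)
{
  /* An inner catch for tag 2 does not intercept a throw to tag 1. */
  return toy_internal_catch(2, toy_body_throws, arg) + 1;
}
```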
 668
 669 \1f
 670 File: internals.info,  Node: Symbols and Variables,  Next: Buffers and Textual Representation,  Prev: Evaluation; Stack Frames; Bindings,  Up: Top
 671
 672 Symbols and Variables
 673 *********************
 674
 675 * Menu:
 676
 677 * Introduction to Symbols::
 678 * Obarrays::
 679 * Symbol Values::
 680
 681 \1f
 682 File: internals.info,  Node: Introduction to Symbols,  Next: Obarrays,  Up: Symbols and Variables
 683
 684 Introduction to Symbols
 685 =======================
 686
 687    A symbol is basically just an object with four fields: a name (a
 688 string), a value (some Lisp object), a function (some Lisp object), and
 689 a property list (usually a list of alternating keyword/value pairs).
 690 What makes symbols special is that there is usually only one symbol with
 691 a given name, and the symbol is referred to by name.  This makes a
 692 symbol a convenient way of calling up data by name, i.e. of implementing
 693 variables. (The variable's value is stored in the "value slot".)
 694 Similarly, functions are referenced by name, and the definition of the
 695 function is stored in a symbol's "function slot".  This means that
 696 there can be a distinct function and variable with the same name.  The
 697 property list is used as a more general mechanism of associating
 698 additional values with particular names, and once again the namespace is
 699 independent of the function and variable namespaces.
 700
 701 \1f
 702 File: internals.info,  Node: Obarrays,  Next: Symbol Values,  Prev: Introduction to Symbols,  Up: Symbols and Variables
 703
 704 Obarrays
 705 ========
 706
 707    The identity of symbols with their names is accomplished through a
 708 structure called an obarray, which is just a poorly-implemented hash
 709 table mapping from strings to symbols whose name is that string. (I say
 710 "poorly implemented" because an obarray appears in Lisp as a vector
 711 with some hidden fields rather than as its own opaque type.  This is an
 712 Emacs Lisp artifact that should be fixed.)
 713
 714    Obarrays are implemented as a vector of some fixed size (which should
 715 be a prime for best results), where each "bucket" of the vector
 716 contains one or more symbols, threaded through a hidden `next' field in
 717 the symbol.  Looking up a symbol in an obarray and adding a symbol to
 718 an obarray are both accomplished through standard hash-table techniques.
 719
 720    The standard Lisp function for working with symbols and obarrays is
 721 `intern'.  This looks up a symbol in an obarray given its name; if it's
 722 not found, a new symbol is automatically created with the specified
 723 name, added to the obarray, and returned.  This is what happens when the
 724 Lisp reader encounters a symbol (or more precisely, encounters the name
 725 of a symbol) in some text that it is reading.  There is a standard
 726 obarray called `obarray' that is used for this purpose, although the
 727 Lisp programmer is free to create his own obarrays and `intern' symbols
 728 in them.
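
   The bucket structure and the find-or-create behavior of `intern' can
be illustrated with a small C model.  This is a sketch under stated
assumptions, not the XEmacs implementation: the obarray is a dedicated
struct rather than a Lisp vector, the hash function is an arbitrary
one, and the `toy_' names are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_OBARRAY_SIZE 31          /* a prime, as recommended above */

struct toy_symbol {
  char *name;
  struct toy_symbol *next;           /* hidden chain within one bucket */
};

struct toy_obarray {
  struct toy_symbol *buckets[TOY_OBARRAY_SIZE];
};

/* Arbitrary string hash, reduced to a bucket index. */
static unsigned toy_hash(const char *s)
{
  unsigned h = 0;
  while (*s)
    h = h * 31u + (unsigned char)*s++;
  return h % TOY_OBARRAY_SIZE;
}

/* Look a name up without creating anything; NULL if absent. */
static struct toy_symbol *toy_intern_soft(struct toy_obarray *ob,
                                          const char *name)
{
  struct toy_symbol *sym;
  for (sym = ob->buckets[toy_hash(name)]; sym != NULL; sym = sym->next)
    if (strcmp(sym->name, name) == 0)
      return sym;
  return NULL;
}

/* Return the symbol with this name, creating and adding it to the
   obarray if it is not there yet.  Returns NULL only when out of memory. */
static struct toy_symbol *toy_intern(struct toy_obarray *ob,
                                     const char *name)
{
  unsigned h;
  struct toy_symbol *sym = toy_intern_soft(ob, name);
  if (sym != NULL)
    return sym;

  sym = malloc(sizeof *sym);
  if (sym == NULL)
    return NULL;
  sym->name = malloc(strlen(name) + 1);
  if (sym->name == NULL) {
    free(sym);
    return NULL;
  }
  strcpy(sym->name, name);

  h = toy_hash(name);
  sym->next = ob->buckets[h];        /* push onto the bucket's chain */
  ob->buckets[h] = sym;
  return sym;
}
```

   Interning the same name twice in one obarray yields the same
pointer, while interning it in a second obarray yields a distinct
symbol with the same name.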
 729
 730    Note that, once a symbol is in an obarray, it stays there until
 731 something is done about it, and the standard obarray `obarray' always
 732 stays around, so once you use any particular variable name, a
 733 corresponding symbol will stay around in `obarray' until you exit
 734 XEmacs.
 735
 736    Note that `obarray' itself is a variable, and as such there is a
 737 symbol in `obarray' whose name is `"obarray"' and which contains
 738 `obarray' as its value.
 739
 740    Note also that this call to `intern' occurs only when in the Lisp
 741 reader, not when the code is executed (at which point the symbol is
 742 already around, stored as such in the definition of the function).
 743
 744    You can create your own obarray using `make-vector' (this is
 745 horrible but is an artifact) and intern symbols into that obarray.
 746 Doing that will result in two or more symbols with the same name.
 747 However, at most one of these symbols is in the standard `obarray': You
 748 cannot have two symbols of the same name in any particular obarray.
 749 Note that you cannot add a symbol to an obarray in any fashion other
 750 than using `intern': i.e. you can't take an existing symbol and put it
 751 in an existing obarray.  Nor can you change the name of an existing
 752 symbol. (Since obarrays are vectors, you can violate the consistency of
 753 things by storing directly into the vector, but let's ignore that
 754 possibility.)
 755
 756    Usually symbols are created by `intern', but if you really want, you
 757 can explicitly create a symbol using `make-symbol', giving it some
 758 name.  The resulting symbol is not in any obarray (i.e. it is
 759 "uninterned"), and you can't add it to any obarray.  Therefore its
 760 primary purpose is as a symbol to use in macros to avoid namespace
 761 pollution.  It can also be used as a carrier of information, but cons
 762 cells could probably be used just as well.
 763
 764    You can also use `intern-soft' to look up a symbol but not create a
 765 new one, and `unintern' to remove a symbol from an obarray.  This
 766 returns the removed symbol. (Remember: You can't put the symbol back
 767 into any obarray.) Finally, `mapatoms' maps over all of the symbols in
 768 an obarray.
 769
 770 \1f
 771 File: internals.info,  Node: Symbol Values,  Prev: Obarrays,  Up: Symbols and Variables
 772
 773 Symbol Values
 774 =============
 775
 776    The value field of a symbol normally contains a Lisp object.
 777 However, a symbol can be "unbound", meaning that it logically has no
 778 value.  This is internally indicated by storing a special Lisp object,
 779 called "the unbound marker" and stored in the global variable
 780 `Qunbound'.  The unbound marker is of a special Lisp object type called
 781 "symbol-value-magic".  It is impossible for the Lisp programmer to
 782 directly create or access any object of this type.
 783
 784    *You must not let any "symbol-value-magic" object escape to the Lisp
 785 level.*  Printing any of these objects will cause the message `INTERNAL
 786 EMACS BUG' to appear as part of the print representation.  (You may see
 787 this normally when you call `debug_print()' from the debugger on a Lisp
 788 object.) If you let one of these objects escape to the Lisp level, you
 789 will violate a number of assumptions contained in the C code and make
 790 the unbound marker not function right.
 791
 792    When a symbol is created, its value field (and function field) are
 793 set to `Qunbound'.  The Lisp programmer can restore these conditions
 794 later using `makunbound' or `fmakunbound', and can query to see whether
 795 the value or function fields are "bound" (i.e. have a value other than
 796 `Qunbound') using `boundp' and `fboundp'.  The fields are set to a
 797 normal Lisp object using `set' (or `setq') and `fset'.
 798
 799    Other symbol-value-magic objects are used as special markers to
 800 indicate variables that have non-normal properties.  This includes any
 801 variables that are tied into C variables (setting the variable magically
 802 sets some global variable in the C code, and likewise for retrieving the
 803 variable's value), variables that magically tie into slots in the
 804 current buffer, variables that are buffer-local, etc.  The
 805 symbol-value-magic object is stored in the value cell in place of a
 806 normal object, and the code to retrieve a symbol's value (i.e.
 807 `symbol-value') knows how to do special things with them.  This means
 808 that you should not just fetch the value cell directly if you want a
 809 symbol's value.
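
   The following C sketch shows why reading the value cell directly
gives the wrong answer for a "magic" variable.  It is a loose model
only: a tagged struct stands in for the symbol-value-magic Lisp
objects, a single forwarding kind (to a C `int') is shown, and all
`toy_' names are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

enum toy_cell_kind {
  TOY_NORMAL,          /* cell holds an ordinary value */
  TOY_UNBOUND,         /* plays the role of the unbound marker */
  TOY_FORWARD_INT      /* value really lives in a C variable */
};

struct toy_cell {
  enum toy_cell_kind kind;
  long value;          /* meaningful only for TOY_NORMAL */
  int *forward;        /* meaningful only for TOY_FORWARD_INT */
};

static int toy_boundp(const struct toy_cell *c)
{
  return c->kind != TOY_UNBOUND;
}

/* Fetch the value through the proper accessor.  Returns 1 and stores
   the value in *out, or returns 0 if the cell is unbound (real code
   would signal an error). */
static int toy_symbol_value(const struct toy_cell *c, long *out)
{
  switch (c->kind) {
  case TOY_NORMAL:
    *out = c->value;
    return 1;
  case TOY_FORWARD_INT:
    *out = *c->forward;          /* follow the link to the C variable */
    return 1;
  default:
    return 0;
  }
}

/* Setting a forwarded variable writes the C variable instead. */
static void toy_set(struct toy_cell *c, long v)
{
  if (c->kind == TOY_FORWARD_INT) {
    *c->forward = (int)v;
  } else {
    c->kind = TOY_NORMAL;
    c->value = v;
  }
}

static void toy_makunbound(struct toy_cell *c)
{
  c->kind = TOY_UNBOUND;
}
```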
 810
 811    The exact workings of this are rather complex and involved and are
 812 well-documented in comments in `buffer.c', `symbols.c', and `lisp.h'.
 813
 814 \1f
 815 File: internals.info,  Node: Buffers and Textual Representation,  Next: MULE Character Sets and Encodings,  Prev: Symbols and Variables,  Up: Top
 816
 817 Buffers and Textual Representation
 818 **********************************
 819
 820 * Menu:
 821
 822 * Introduction to Buffers::     A buffer holds a block of text such as a file.
 823 * The Text in a Buffer::        Representation of the text in a buffer.
 824 * Buffer Lists::                Keeping track of all buffers.
 825 * Markers and Extents::         Tagging locations within a buffer.
 826 * Bufbytes and Emchars::        Representation of individual characters.
 827 * The Buffer Object::           The Lisp object corresponding to a buffer.
 828
 829 \1f
 830 File: internals.info,  Node: Introduction to Buffers,  Next: The Text in a Buffer,  Up: Buffers and Textual Representation
 831
 832 Introduction to Buffers
 833 =======================
 834
 835    A buffer is logically just a Lisp object that holds some text.  In
 836 this, it is like a string, but a buffer is optimized for frequent
 837 insertion and deletion, while a string is not.  Furthermore:
 838
 839   1. Buffers are "permanent" objects, i.e. once you create them, they
 840      remain around, and need to be explicitly deleted before they go
 841      away.
 842
 843   2. Each buffer has a unique name, which is a string.  Buffers are
 844      normally referred to by name.  In this respect, they are like
 845      symbols.
 846
 847   3. Buffers have a default insertion position, called "point".
 848      Inserting text (unless you explicitly give a position) goes at
 849      point, and moves point forward past the text.  This is what is
 850      going on when you type text into Emacs.
 851
 852   4. Buffers have lots of extra properties associated with them.
 853
 854   5. Buffers can be "displayed".  What this means is that there exist a
 855      number of "windows", which are objects that correspond to some
 856      visible section of your display, and each window has an associated
 857      buffer, and the current contents of the buffer are shown in that
 858      section of the display.  The redisplay mechanism (which takes care
 859      of doing this) knows how to look at the text of a buffer and come
 860      up with some reasonable way of displaying this.  Many of the
 861      properties of a buffer control how the buffer's text is displayed.
 862
 863   6. One buffer is distinguished and called the "current buffer".  It is
 864      stored in the variable `current_buffer'.  Buffer operations operate
 865      on this buffer by default.  When you are typing text into a
 866      buffer, the buffer you are typing into is always `current_buffer'.
 867      Switching to a different window changes the current buffer.  Note
 868      that Lisp code can temporarily change the current buffer using
 869      `set-buffer' (often enclosed in a `save-excursion' so that the
 870      former current buffer gets restored when the code is finished).
 871      However, calling `set-buffer' will NOT cause a permanent change in
 872      the current buffer.  The reason for this is that the top-level
 873      event loop sets `current_buffer' to the buffer of the selected
 874      window, each time it finishes executing a user command.
 875
 876    Make sure you understand the distinction between "current buffer"
 877 and "buffer of the selected window", and the distinction between
 878 "point" of the current buffer and "window-point" of the selected
 879 window. (This latter distinction is explained in detail in the section
 880 on windows.)
 881
 882 \1f
 883 File: internals.info,  Node: The Text in a Buffer,  Next: Buffer Lists,  Prev: Introduction to Buffers,  Up: Buffers and Textual Representation
 884
 885 The Text in a Buffer
 886 ====================
 887
 888    The text in a buffer consists of a sequence of zero or more
 889 characters.  A "character" is an integer that logically represents a
 890 letter, number, space, or other unit of text.  Most of the characters
 891 that you will typically encounter belong to the ASCII set of characters,
 892 but there are also characters for various sorts of accented letters,
 893 special symbols, Chinese and Japanese ideograms (i.e. Kanji, Katakana,
 894 etc.), Cyrillic and Greek letters, etc.  The actual number of possible
 895 characters is quite large.
 896
 897    For now, we can view a character as some non-negative integer that
 898 has some shape that defines how it typically appears (e.g. as an
 899 uppercase A). (The exact way in which a character appears depends on the
 900 font used to display the character.) The internal type of characters in
 901 the C code is an `Emchar'; this is just an `int', but using a symbolic
 902 type makes the code clearer.
 903
 904    Between every character in a buffer is a "buffer position" or
 905 "character position".  We can speak of the character before or after a
 906 particular buffer position, and when you insert a character at a
 907 particular position, all characters after that position end up at new
 908 positions.  When we speak of the character "at" a position, we really
 909 mean the character after the position.  (This schizophrenia between a
 910 buffer position being "between" characters and "on" a character is
 911 rampant in Emacs.)
 912
 913    Buffer positions are numbered starting at 1.  This means that
 914 position 1 is before the first character, and position 0 is not valid.
 915 If there are N characters in a buffer, then buffer position N+1 is
 916 after the last one, and position N+2 is not valid.
 917
 918    The internal makeup of the Emchar integer varies depending on whether
 919 we have compiled with MULE support.  If not, the Emchar integer is an
 920 8-bit integer with possible values from 0 - 255.  0 - 127 are the
 921 standard ASCII characters, while 128 - 255 are the characters from the
 922 ISO-8859-1 character set.  If we have compiled with MULE support, an
 923 Emchar is a 19-bit integer, with the various bits having meanings
 924 according to a complex scheme that will be detailed later.  The
 925 characters numbered 0 - 255 still have the same meanings as for the
 926 non-MULE case, though.
 927
 928    Internally, the text in a buffer is represented in a fairly simple
 929 fashion: as a contiguous array of bytes, with a "gap" of some size in
 930 the middle.  Although the gap is of some substantial size in bytes,
 931 there is no text contained within it: From the perspective of the text
 932 in the buffer, it does not exist.  The gap logically sits at some buffer
 933 position, between two characters (or possibly at the beginning or end of
 934 the buffer).  Insertion of text in a buffer at a particular position is
 935 always accomplished by first moving the gap to that position (i.e.
 936 through some block moving of text), then writing the text into the
 937 beginning of the gap, thereby shrinking the gap.  If the gap shrinks
 938 down to nothing, a new gap is created. (What actually happens is that a
 939 new gap is "created" at the end of the buffer's text, which requires
 940 nothing more than changing a couple of indices; then the gap is "moved"
 941 to the position where the insertion needs to take place by moving up in
 942 memory all the text after that position.)  Similarly, deletion occurs
 943 by moving the gap to the place where the text is to be deleted, and
 944 then simply expanding the gap to include the deleted text.
 945 ("Expanding" and "shrinking" the gap as just described means just that
 946 the internal indices that keep track of where the gap is located are
 947 changed.)
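
   A minimal gap buffer along these lines might look as follows in C.
This is a sketch, not the XEmacs code: positions are 0-based byte
offsets rather than 1-based buffer positions, one byte equals one
character, and the names and the slack of 16 bytes added when the gap
is enlarged are arbitrary assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_GAP_SLACK 16      /* extra room added whenever the gap grows */

struct toy_buffer {
  unsigned char *text;        /* one block: text, gap, more text */
  size_t gap_start;           /* offset of the first byte of the gap */
  size_t gap_size;            /* bytes in the gap */
  size_t text_len;            /* bytes of real text, gap excluded */
};

static int toy_buffer_init(struct toy_buffer *b, size_t gap)
{
  b->text = malloc(gap ? gap : 1);
  if (b->text == NULL)
    return 0;
  b->gap_start = 0;
  b->gap_size = gap;
  b->text_len = 0;
  return 1;
}

/* Move the gap so that it starts at text offset pos. */
static void toy_move_gap(struct toy_buffer *b, size_t pos)
{
  if (pos < b->gap_start)     /* shift text [pos, gap_start) up past the gap */
    memmove(b->text + pos + b->gap_size, b->text + pos, b->gap_start - pos);
  else if (pos > b->gap_start)   /* shift text after the gap down into it */
    memmove(b->text + b->gap_start,
            b->text + b->gap_start + b->gap_size, pos - b->gap_start);
  b->gap_start = pos;
}

/* Insert n bytes at text offset pos.  Returns 1 on success. */
static int toy_insert(struct toy_buffer *b, size_t pos,
                      const char *s, size_t n)
{
  if (pos > b->text_len)
    return 0;
  if (n > b->gap_size) {
    /* Put the gap at the end of the text, then enlarge the block. */
    size_t extra = n - b->gap_size + TOY_GAP_SLACK;
    unsigned char *p;
    toy_move_gap(b, b->text_len);
    p = realloc(b->text, b->text_len + b->gap_size + extra);
    if (p == NULL)
      return 0;
    b->text = p;
    b->gap_size += extra;
  }
  toy_move_gap(b, pos);
  memcpy(b->text + b->gap_start, s, n);   /* write into the gap's start */
  b->gap_start += n;
  b->gap_size -= n;
  b->text_len += n;
  return 1;
}

/* Delete n bytes starting at pos: the gap simply swallows them. */
static int toy_delete(struct toy_buffer *b, size_t pos, size_t n)
{
  if (pos > b->text_len || n > b->text_len - pos)
    return 0;
  toy_move_gap(b, pos);
  b->gap_size += n;
  b->text_len -= n;
  return 1;
}

/* Copy the text, skipping the gap, into out (text_len + 1 bytes). */
static void toy_copy_text(const struct toy_buffer *b, char *out)
{
  memcpy(out, b->text, b->gap_start);
  memcpy(out + b->gap_start, b->text + b->gap_start + b->gap_size,
         b->text_len - b->gap_start);
  out[b->text_len] = '\0';
}
```

   Note that `toy_delete()' never shrinks the allocation: the deleted
bytes just become part of the gap, so the block stays the same size
however much text is removed.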
 948
 949    Note that the total amount of memory allocated for a buffer text
 950 never decreases while the buffer is live.  Therefore, if you load up a
 951 20-megabyte file and then delete all but one character, there will be a
 952 20-megabyte gap, which won't get any smaller (except by inserting
 953 characters back again).  Once the buffer is killed, the memory allocated
 954 for the buffer text will be freed, but it will still be sitting on the
 955 heap, taking up virtual memory, and will not be released back to the
 956 operating system. (However, if you have compiled XEmacs with rel-alloc,
 957 the situation is different.  In this case, the space *will* be released
 958 back to the operating system.  However, this tends to result in a
 959 noticeable speed penalty.)
 960
 961    Astute readers may notice that the text in a buffer is represented as
 962 an array of *bytes*, while (at least in the MULE case) an Emchar is a
 963 19-bit integer, which clearly cannot fit in a byte.  This means (of
 964 course) that the text in a buffer uses a different representation from
 965 an Emchar: specifically, the 19-bit Emchar becomes a series of one to
 966 four bytes.  The conversion between these two representations is complex
 967 and will be described later.
 968
 969    In the non-MULE case, everything is very simple: An Emchar is an
 970 8-bit value, which fits neatly into one byte.
 971
 972    If we are given a buffer position and want to retrieve the character
 973 at that position, we need to follow these steps:
 974
 975   1. Pretend there's no gap, and convert the buffer position into a
 976      "byte index" that indexes to the appropriate byte in the buffer's
 977      stream of textual bytes.  By convention, byte indices begin at 1,
 978      just like buffer positions.  In the non-MULE case, byte indices
 979      and buffer positions are identical, since one character equals one
 980      byte.
 981
 982   2. Convert the byte index into a "memory index", which takes the gap
 983      into account.  The memory index is a direct index into the block of
 984      memory that stores the text of a buffer.  This basically just
 985      involves checking to see if the byte index is past the gap, and if
 986      so, adding the size of the gap to it.  By convention, memory
 987      indices begin at 1, just like buffer positions and byte indices,
 988      and when referring to the position that is "at" the gap, we always
 989      use the memory position at the *beginning*, not at the end, of the
 990      gap.
 991
 992   3. Fetch the appropriate bytes at the determined memory position.
 993
 994   4. Convert these bytes into an Emchar.
 995
 996    In the non-MULE case, steps (3) and (4) boil down to a simple one-byte
 997 memory access.
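
   For the non-MULE case, the four steps can be sketched in C as shown
below.  The typedef names come from this section, but the struct and
the `toy_' functions are hypothetical and the real macros in the
sources may differ in detail.  Because the character "at" a position is
the one after it, a position equal to the start of the gap must read
from past the gap, hence the `>=' comparison.

```c
#include <assert.h>

typedef int Bufpos;   /* buffer (character) position, 1-based */
typedef int Bytind;   /* byte index, 1-based, gap ignored */
typedef int Memind;   /* memory index, 1-based, gap included */

struct toy_text {
  const unsigned char *mem;   /* block holding text and gap */
  Bytind gap_start;           /* byte index at which the gap sits */
  int gap_size;               /* size of the gap in bytes */
};

/* Step 1: without MULE, one character is one byte. */
static Bytind toy_bufpos_to_bytind(Bufpos pos)
{
  return pos;
}

/* Step 2, for fetching: memory index of the byte that follows byte
   index bi.  Bytes at or after the gap start live past the gap. */
static Memind toy_memind_of_byte_after(const struct toy_text *t, Bytind bi)
{
  return bi >= t->gap_start ? bi + t->gap_size : bi;
}

/* Steps 3 and 4: a single byte access yields the character. */
static int toy_fetch_char(const struct toy_text *t, Bufpos pos)
{
  Bytind bi = toy_bufpos_to_bytind(pos);
  Memind mi = toy_memind_of_byte_after(t, bi);
  return t->mem[mi - 1];      /* indices are 1-based, C arrays 0-based */
}
```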
 998
 999    Note that we have defined three types of positions in a buffer:
1000
1001   1. "buffer positions" or "character positions", typedef `Bufpos'
1002
1003   2. "byte indices", typedef `Bytind'
1004
1005   3. "memory indices", typedef `Memind'
1006
1007    All three typedefs are just `int's, but defining them this way makes
1008 things a lot clearer.
1009
1010    Most code works with buffer positions.  In particular, all Lisp code
1011 that refers to text in a buffer uses buffer positions.  Lisp code does
1012 not know that byte indices or memory indices exist.
1013
1014    Finally, we have a typedef for the bytes in a buffer.  This is a
1015 `Bufbyte', which is an unsigned char.  Referring to them as Bufbytes
1016 underscores the fact that we are working with a string of bytes in the
1017 internal Emacs buffer representation rather than in one of a number of
1018 possible alternative representations (e.g. EUC-encoded text, etc.).
1019