This is Info file ../../info/internals.info, produced by Makeinfo
version 1.68 from the input file internals.texi.

INFO-DIR-SECTION XEmacs Editor
START-INFO-DIR-ENTRY
* Internals: (internals).       XEmacs Internals Manual.
END-INFO-DIR-ENTRY

Copyright (C) 1992 - 1996 Ben Wing.
Copyright (C) 1996, 1997 Sun Microsystems.
Copyright (C) 1994 - 1998 Free Software Foundation.
Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that
the section entitled "GNU General Public License" is included exactly
as in the original, and provided that the entire resulting derived work
is distributed under the terms of a permission notice identical to this
one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the section entitled "GNU General Public License"
may be included in a translation approved by the Free Software
Foundation instead of in the original English.

File: internals.info, Node: Buffer Lists, Next: Markers and Extents, Prev: The Text in a Buffer, Up: Buffers and Textual Representation

Buffer Lists
============

Recall from earlier that buffers are "permanent" objects, i.e.
that they remain around until explicitly deleted.  This entails that
there is a list of all the buffers in existence.  This list is actually
an assoc-list (mapping from the buffer's name to the buffer) and is
stored in the global variable `Vbuffer_alist'.

The order of the buffers in the list is important: the buffers are
ordered approximately from most-recently-used to least-recently-used.
Switching to a buffer using `switch-to-buffer', `pop-to-buffer', etc.
and switching windows using `other-window', etc.  usually brings the new
current buffer to the front of the list.  `switch-to-buffer',
`other-buffer', etc. look at the beginning of the list to find an
alternative buffer to suggest.  You can also explicitly move a buffer to
the end of the list using `bury-buffer'.

In addition to the global ordering in `Vbuffer_alist', each frame has
its own ordering of the list.  These lists always contain the same
elements as in `Vbuffer_alist' although possibly in a different order.
`buffer-list' normally returns the list for the selected frame.  This
allows you to work in separate frames without things interfering with
each other.

The standard way to look up a buffer given a name is `get-buffer', and
the standard way to create a new buffer is `get-buffer-create', which
looks up a buffer with a given name, creating a new one if necessary.
These operations correspond exactly with the symbol operations
`intern-soft' and `intern', respectively.  You can also force a new
buffer to be created using `generate-new-buffer', which takes a name and
(if necessary) makes a unique name from this by appending a number, and
then creates the buffer.  This is basically like the symbol operation
`gensym'.

File: internals.info, Node: Markers and Extents, Next: Bufbytes and Emchars, Prev: Buffer Lists, Up: Buffers and Textual Representation

Markers and Extents
===================

Among the things associated with a buffer are things that are logically
attached to certain buffer positions.
This can be used to keep track of a buffer position when text is
inserted and deleted, so that it remains at the same spot relative to
the text around it; to assign properties to particular sections of
text; etc.  There are two such objects that are useful in this regard:
they are "markers" and "extents".

A "marker" is simply a flag placed at a particular buffer position,
which is moved around as text is inserted and deleted.  Markers are
used for all sorts of purposes, such as the `mark' that is the other
end of textual regions to be cut, copied, etc.

An "extent" is similar to two markers plus some associated properties,
and is used to keep track of regions in a buffer as text is inserted
and deleted, and to add properties (e.g. fonts) to particular regions
of text.  The external interface of extents is explained elsewhere.

The important thing here is that markers and extents simply contain
buffer positions in them as integers, and every time text is inserted
or deleted, these positions must be updated.  In order to minimize the
amount of shuffling that needs to be done, the positions in markers and
extents (there's one per marker, two per extent) are stored as Meminds.
This means that they only need to be moved when the text is physically
moved in memory; since the gap structure tries to minimize this, it
also minimizes the number of marker and extent indices that need to be
adjusted.  Look in `insdel.c' for the details of how this works.

One other important distinction is that markers are "temporary" while
extents are "permanent".  This means that markers disappear as soon as
there are no more pointers to them, and correspondingly, there is no way
to determine what markers are in a buffer if you are just given the
buffer.  Extents remain in a buffer until they are detached (which
could happen as a result of text being deleted) or the buffer is
deleted, and primitives do exist to enumerate the extents in a buffer.

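To make the point about Meminds concrete, here is a minimal sketch of a
toy gap buffer in C.  It is not the code in `insdel.c': the type
`toy_buffer' and every `toy_' function are hypothetical names invented
for this illustration, and the sketch assumes the text and the marker
table fit in small fixed arrays.  Markers hold memory indices, so an
insertion at the gap touches no marker at all; only a gap move adjusts
the markers that sit in the text being physically shifted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy types; none of these names come from the XEmacs sources. */
enum { TOY_CAPACITY = 32, TOY_MAX_MARKERS = 4 };

typedef struct {
    char text[TOY_CAPACITY];      /* text before gap, the gap, text after gap */
    int gap_start;                /* memory index of the first gap byte */
    int gap_size;                 /* number of unused bytes in the gap */
    int markers[TOY_MAX_MARKERS]; /* marker positions, as memory indices */
    int n_markers;
} toy_buffer;

/* Assumes strlen(s) <= TOY_CAPACITY; the gap starts out at the end. */
void toy_init(toy_buffer *b, const char *s)
{
    int len = (int) strlen(s);
    memset(b, 0, sizeof *b);
    memcpy(b->text, s, (size_t) len);
    b->gap_start = len;
    b->gap_size = TOY_CAPACITY - len;
}

/* Buffer position (0-based, gap ignored) to memory index. */
int toy_pos_to_memind(const toy_buffer *b, int pos)
{
    return pos < b->gap_start ? pos : pos + b->gap_size;
}

/* Memory index back to buffer position. */
int toy_memind_to_pos(const toy_buffer *b, int memind)
{
    return memind < b->gap_start ? memind : memind - b->gap_size;
}

/* Assumes fewer than TOY_MAX_MARKERS markers exist; returns the marker slot. */
int toy_add_marker(toy_buffer *b, int pos)
{
    b->markers[b->n_markers] = toy_pos_to_memind(b, pos);
    return b->n_markers++;
}

/* Move the gap to buffer position POS.  Only markers attached to the
   text that physically moves are adjusted. */
void toy_move_gap(toy_buffer *b, int pos)
{
    int i;
    if (pos < b->gap_start) {
        /* Text in [pos, gap_start) slides up by gap_size bytes. */
        int n = b->gap_start - pos;
        memmove(b->text + pos + b->gap_size, b->text + pos, (size_t) n);
        for (i = 0; i < b->n_markers; i++)
            if (b->markers[i] >= pos && b->markers[i] < b->gap_start)
                b->markers[i] += b->gap_size;
    } else if (pos > b->gap_start) {
        /* Text just after the gap slides down by gap_size bytes. */
        int gap_end = b->gap_start + b->gap_size;
        int n = pos - b->gap_start;
        memmove(b->text + b->gap_start, b->text + gap_end, (size_t) n);
        for (i = 0; i < b->n_markers; i++)
            if (b->markers[i] >= gap_end && b->markers[i] < gap_end + n)
                b->markers[i] -= b->gap_size;
    }
    b->gap_start = pos;
}

/* Insert one byte at buffer position POS.  Returns 0, or -1 if full.
   The insertion itself changes no marker: the gap simply shrinks. */
int toy_insert(toy_buffer *b, int pos, char c)
{
    if (b->gap_size == 0)
        return -1;
    toy_move_gap(b, pos);
    b->text[b->gap_start++] = c;
    b->gap_size--;
    return 0;
}
```

In this sketch a second insertion at the same spot leaves every stored
marker index untouched, yet `toy_memind_to_pos' reports the marker one
buffer position further along, because the gap that is subtracted has
become one byte smaller.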
File: internals.info, Node: Bufbytes and Emchars, Next: The Buffer Object, Prev: Markers and Extents, Up: Buffers and Textual Representation

Bufbytes and Emchars
====================

Not yet documented.

File: internals.info, Node: The Buffer Object, Prev: Bufbytes and Emchars, Up: Buffers and Textual Representation

The Buffer Object
=================

Buffers contain fields not directly accessible by the Lisp programmer.
We describe them here, naming them by the names used in the C code.
Many are accessible indirectly in Lisp programs via Lisp primitives.

`name'
     The buffer name is a string that names the buffer.  It is
     guaranteed to be unique.
     *Note Buffer Names: (lispref)Buffer Names.

`save_modified'
     This field contains the time when the buffer was last saved, as an
     integer.
     *Note Buffer Modification: (lispref)Buffer Modification.

`modtime'
     This field contains the modification time of the visited file.  It
     is set when the file is written or read.  Every time the buffer is
     written to the file, this field is compared to the modification
     time of the file.
     *Note Buffer Modification: (lispref)Buffer Modification.

`auto_save_modified'
     This field contains the time when the buffer was last auto-saved.

`last_window_start'
     This field contains the `window-start' position in the buffer as of
     the last time the buffer was displayed in a window.

`undo_list'
     This field points to the buffer's undo list.
     *Note Undo: (lispref)Undo.

`syntax_table_v'
     This field contains the syntax table for the buffer.
     *Note Syntax Tables: (lispref)Syntax Tables.

`downcase_table'
     This field contains the conversion table for converting text to
     lower case.
     *Note Case Tables: (lispref)Case Tables.

`upcase_table'
     This field contains the conversion table for converting text to
     upper case.
     *Note Case Tables: (lispref)Case Tables.

`case_canon_table'
     This field contains the conversion table for canonicalizing text
     for case-folding search.
     *Note Case Tables: (lispref)Case Tables.

`case_eqv_table'
     This field contains the equivalence table for case-folding search.
     *Note Case Tables: (lispref)Case Tables.

`display_table'
     This field contains the buffer's display table, or `nil' if it
     doesn't have one.
     *Note Display Tables: (lispref)Display Tables.

`markers'
     This field contains the chain of all markers that currently point
     into the buffer.  Deletion of text in the buffer, and motion of
     the buffer's gap, must check each of these markers and perhaps
     update it.
     *Note Markers: (lispref)Markers.

`backed_up'
     This field is a flag that tells whether a backup file has been
     made for the visited file of this buffer.

`mark'
     This field contains the mark for the buffer.  The mark is a marker,
     hence it is also included on the list `markers'.
     *Note The Mark: (lispref)The Mark.

`mark_active'
     This field is non-`nil' if the buffer's mark is active.

`local_var_alist'
     This field contains the association list describing the variables
     local in this buffer, and their values, with the exception of
     local variables that have special slots in the buffer object.
     (Those slots are omitted from this table.)
     *Note Buffer-Local Variables: (lispref)Buffer-Local Variables.

`modeline_format'
     This field contains a Lisp object which controls how to display
     the mode line for this buffer.
     *Note Modeline Format: (lispref)Modeline Format.

`base_buffer'
     This field holds the buffer's base buffer (if it is an indirect
     buffer), or `nil'.

File: internals.info, Node: MULE Character Sets and Encodings, Next: The Lisp Reader and Compiler, Prev: Buffers and Textual Representation, Up: Top

MULE Character Sets and Encodings
*********************************

Recall that there are two primary ways that text is represented in
XEmacs.  The "buffer" representation sees the text as a series of bytes
(Bufbytes), with a variable number of bytes used per character.  The
"character" representation sees the text as a series of integers
(Emchars), one per character.
The character representation is a cleaner representation from a
theoretical standpoint, and is thus used in many cases when lots of
manipulations on a string need to be done.  However, the buffer
representation is the standard representation used in both Lisp strings
and buffers, and because of this, it is the "default" representation
that text comes in.  The reason for using this representation is that
it's compact and is compatible with ASCII.

* Menu:

* Character Sets::
* Encodings::
* Internal Mule Encodings::
* CCL::

File: internals.info, Node: Character Sets, Next: Encodings, Up: MULE Character Sets and Encodings

Character Sets
==============

A character set (or "charset") is an ordered set of characters.  A
particular character in a charset is indexed using one or more
"position codes", which are non-negative integers.  The number of
position codes needed to identify a particular character in a charset
is called the "dimension" of the charset.  In XEmacs/Mule, all charsets
have dimension 1 or 2, and the size of all charsets (except for a few
special cases) is either 94, 96, 94 by 94, or 96 by 96.  The range of
position codes used to index characters from any of these types of
character sets is as follows:

     Charset type            Position code 1         Position code 2
     ------------------------------------------------------------
     94                      33 - 126                N/A
     96                      32 - 127                N/A
     94x94                   33 - 126                33 - 126
     96x96                   32 - 127                32 - 127

Note that in the above cases position codes do not start at an expected
value such as 0 or 1.  The reason for this will become clear later.

For example, Latin-1 is a 96-character charset, and JISX0208 (the
Japanese national character set) is a 94x94-character charset.

[Note that, although the ranges above define the *valid* position codes
for a charset, some of the slots in a particular charset may in fact be
empty.  This is the case for JISX0208, for example, where (e.g.) all
the slots whose first position code is in the range 117 - 126 are
empty.]

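The position-code ranges in the table above can be checked mechanically.
The following is a minimal sketch in C; the struct `toy_charset' and the
`toy_' functions are hypothetical names used only here and do not
correspond to anything in the XEmacs sources.  It assumes a charset is
described just by its size per dimension (94 or 96) and its dimension
(1 or 2), which leaves out the special charsets.

```c
#include <assert.h>

/* Hypothetical description of a charset "shape"; not an XEmacs type. */
typedef struct {
    int chars;      /* characters per dimension: assumed to be 94 or 96 */
    int dimension;  /* position codes per character: 1 or 2 */
} toy_charset;

/* Lowest valid position code: 33 for 94-sets, 32 for 96-sets. */
int toy_min_code(const toy_charset *cs)
{
    return cs->chars == 94 ? 33 : 32;
}

/* Highest valid position code: 126 for 94-sets, 127 for 96-sets. */
int toy_max_code(const toy_charset *cs)
{
    return cs->chars == 94 ? 126 : 127;
}

/* Nonzero if the position codes are in range for CS.
   PC2 is ignored for dimension-1 charsets (the "N/A" column). */
int toy_valid_p(const toy_charset *cs, int pc1, int pc2)
{
    int lo = toy_min_code(cs), hi = toy_max_code(cs);
    if (pc1 < lo || pc1 > hi)
        return 0;
    if (cs->dimension == 1)
        return 1;
    return pc2 >= lo && pc2 <= hi;
}

/* Zero-based slot number of a character within CS, or -1 if the
   position codes are out of range.  A slot may still be empty. */
int toy_slot(const toy_charset *cs, int pc1, int pc2)
{
    int lo = toy_min_code(cs);
    if (!toy_valid_p(cs, pc1, pc2))
        return -1;
    if (cs->dimension == 1)
        return pc1 - lo;
    return (pc1 - lo) * cs->chars + (pc2 - lo);
}
```

With a 96-character shape such as Latin-1, position code 32 maps to
slot 0; with a 94x94 shape such as JISX0208, the pair (33, 33) maps to
slot 0 and a first position code of 32 is rejected.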
There are three charsets that do not follow the above rules.  All of
them have one dimension, and have ranges of position codes as follows:

     Charset name            Position code 1
     ------------------------------------
     ASCII                   0 - 127
     Control-1               0 - 31
     Composite               0 - some large number

(The upper bound of the position code for composite characters has not
yet been determined, but it will probably be at least 16,383).

ASCII is the union of two subsidiary character sets: Printing-ASCII
(the printing ASCII character set, consisting of position codes 33 -
126, as for a standard 94-character charset) and Control-ASCII (the
non-printing characters that would appear in a binary file with codes 0
- 32 and 127).

Control-1 contains the non-printing characters that would appear in a
binary file with codes 128 - 159.

Composite contains characters that are generated by overstriking one or
more characters from other charsets.

Note that some characters in ASCII, and all characters in Control-1,
are "control" (non-printing) characters.  These have no printed
representation but instead control some other function of the printing
(e.g. TAB, code 9, moves the current character position to the next tab
stop).  All other characters in all charsets are "graphic" (printing)
characters.

When a binary file is read in, the bytes in the file are assigned to
character sets as follows:

     Bytes           Character set           Range
     --------------------------------------------------
     0 - 127         ASCII                   0 - 127
     128 - 159       Control-1               0 - 31
     160 - 255       Latin-1                 32 - 127

This is a bit ad-hoc but gets the job done.

File: internals.info, Node: Encodings, Next: Internal Mule Encodings, Prev: Character Sets, Up: MULE Character Sets and Encodings

Encodings
=========

An "encoding" is a way of numerically representing characters from one
or more character sets.  If an encoding only encompasses one character
set, then the position codes for the characters in that character set
could be used directly.
This is not possible, however, if more than one character set is to be
used in the encoding.

For example, the conversion detailed above between bytes in a binary
file and characters is effectively an encoding that encompasses the
three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit
bytes.

Thus, an encoding can be viewed as a way of encoding characters from a
specified group of character sets using a stream of bytes, each of
which contains a fixed number of bits (but not necessarily 8, as in the
common usage of "byte").

Here are descriptions of a couple of common encodings:

* Menu:

* Japanese EUC (Extended Unix Code)::
* JIS7::

File: internals.info, Node: Japanese EUC (Extended Unix Code), Next: JIS7, Up: Encodings

Japanese EUC (Extended Unix Code)
---------------------------------

This encompasses the character sets Printing-ASCII,
Japanese-JISX0201-Kana (half-width katakana, the right half of
JISX0201), Japanese-JISX0208, and Japanese-JISX0212.  It uses 8-bit
bytes.

Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
charsets, while Japanese-JISX0208 and Japanese-JISX0212 are
94x94-character charsets.

The encoding is as follows:

     Character set            Representation (PC=position-code)
     -------------            --------------
     Printing-ASCII           PC1
     Japanese-JISX0201-Kana   0x8E       | PC1 + 0x80
     Japanese-JISX0208        PC1 + 0x80 | PC2 + 0x80
     Japanese-JISX0212        0x8F       | PC1 + 0x80 | PC2 + 0x80

File: internals.info, Node: JIS7, Prev: Japanese EUC (Extended Unix Code), Up: Encodings

JIS7
----

This encompasses the character sets Printing-ASCII,
Japanese-JISX0201-Roman (the left half of JISX0201; this character set
is very similar to Printing-ASCII and is a 94-character charset),
Japanese-JISX0208, and Japanese-JISX0201-Kana.  It uses 7-bit bytes.

Unlike Japanese EUC, this is a "modal" encoding, which means that there
are multiple states that the encoding can be in, which affect how the
bytes are to be interpreted.  Special sequences of bytes (called
"escape sequences") are used to change states.

The encoding is as follows:

     Character set              Representation (PC=position-code)
     -------------              --------------
     Printing-ASCII             PC1
     Japanese-JISX0201-Roman    PC1
     Japanese-JISX0201-Kana     PC1
     Japanese-JISX0208          PC1 PC2


     Escape sequence   ASCII equivalent   Meaning
     ---------------   ----------------   -------
     0x1B 0x28 0x4A    ESC ( J            invoke Japanese-JISX0201-Roman
     0x1B 0x28 0x49    ESC ( I            invoke Japanese-JISX0201-Kana
     0x1B 0x24 0x42    ESC $ B            invoke Japanese-JISX0208
     0x1B 0x28 0x42    ESC ( B            invoke Printing-ASCII

Initially, Printing-ASCII is invoked.

File: internals.info, Node: Internal Mule Encodings, Next: CCL, Prev: Encodings, Up: MULE Character Sets and Encodings

Internal Mule Encodings
=======================

In XEmacs/Mule, each character set is assigned a unique number, called
a "leading byte".  This is used in the encodings of a character.
Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has
a leading byte of 0), although some leading bytes are reserved.

Charsets whose leading byte is in the range 0x80 - 0x9F are called
"official" and are used for built-in charsets.  Other charsets are
called "private" and have leading bytes in the range 0xA0 - 0xFF; these
are user-defined charsets.

More specifically:

     Character set           Leading byte
     -------------           ------------
     ASCII                   0
     Composite               0x80
     Dimension-1 Official    0x81 - 0x8D
                               (0x8E is free)
     Control-1               0x8F
     Dimension-2 Official    0x90 - 0x99
                               (0x9A - 0x9D are free;
                                0x9E and 0x9F are reserved)
     Dimension-1 Private     0xA0 - 0xEF
     Dimension-2 Private     0xF0 - 0xFF

There are two internal encodings for characters in XEmacs/Mule.  One is
called "string encoding" and is an 8-bit encoding that is used for
representing characters in a buffer or string.  It uses 1 to 4 bytes per
character.  The other is called "character encoding" and is a 19-bit
encoding that is used for representing characters individually in a
variable.

(In the following descriptions, we'll ignore composite characters for
the moment.
We also give a general (structural) overview first, followed later by
the exact details.)

* Menu:

* Internal String Encoding::
* Internal Character Encoding::

File: internals.info, Node: Internal String Encoding, Next: Internal Character Encoding, Up: Internal Mule Encodings

Internal String Encoding
------------------------

ASCII characters are encoded using their position code directly.  Other
characters are encoded using their leading byte followed by their
position code(s) with the high bit set.  Characters in private character
sets have their leading byte prefixed with a "leading byte prefix",
which is either 0x9E or 0x9F.  (No character sets are ever assigned
these leading bytes.)  Specifically:

     Character set           Encoding (PC=position-code, LB=leading-byte)
     -------------           --------
     ASCII                   PC1  |
     Control-1               LB   | PC1 + 0xA0 |
     Dimension-1 official    LB   | PC1 + 0x80 |
     Dimension-1 private     0x9E | LB         | PC1 + 0x80 |
     Dimension-2 official    LB   | PC1 + 0x80 | PC2 + 0x80 |
     Dimension-2 private     0x9F | LB         | PC1 + 0x80 | PC2 + 0x80

The basic characteristic of this encoding is that the first byte of all
characters is in the range 0x00 - 0x9F, and the second and following
bytes of all characters are in the range 0xA0 - 0xFF.  This means that
it is impossible to get out of sync, or more specifically:

  1. Given any byte position, the beginning of the character it is
     within can be determined in constant time.

  2. Given any byte position at the beginning of a character, the
     beginning of the next character can be determined in constant time.

  3. Given any byte position at the beginning of a character, the
     beginning of the previous character can be determined in constant
     time.

  4. Textual searches can simply treat encoded strings as if they were
     encoded in a one-byte-per-character fashion rather than the actual
     multi-byte encoding.

None of the standard non-modal encodings meet all of these conditions.
For example, EUC satisfies only (2) and (3), while Shift-JIS and Big5
(not yet described) satisfy only (2).  (All non-modal encodings must
satisfy (2), in order to be unambiguous.)

File: internals.info, Node: Internal Character Encoding, Prev: Internal String Encoding, Up: Internal Mule Encodings

Internal Character Encoding
---------------------------

One 19-bit word represents a single character.  The word is separated
into three fields:

     Bit number:     18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
                     <------------> <------------------> <------------------>
     Field:                1                  2                    3

Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5 bits.

     Character set           Field 1         Field 2         Field 3
     -------------           -------         -------         -------
     ASCII                   0               0               PC1
        range:                                               (00 - 7F)
     Control-1               0               1               PC1
        range:                                               (00 - 1F)
     Dimension-1 official    0               LB - 0x80       PC1
        range:                               (01 - 0D)       (20 - 7F)
     Dimension-1 private     0               LB - 0x80       PC1
        range:                               (20 - 6F)       (20 - 7F)
     Dimension-2 official    LB - 0x8F       PC1             PC2
        range:               (01 - 0A)       (20 - 7F)       (20 - 7F)
     Dimension-2 private     LB - 0xE1       PC1             PC2
        range:               (0F - 1E)       (20 - 7F)       (20 - 7F)
     Composite               0x1F            ?               ?

Note that character codes 0 - 255 are the same as the "binary encoding"
described above.

File: internals.info, Node: CCL, Prev: Internal Mule Encodings, Up: MULE Character Sets and Encodings

CCL
===

     CCL PROGRAM SYNTAX:
          CCL_PROGRAM := (CCL_MAIN_BLOCK
                          [ CCL_EOF_BLOCK ])

          CCL_MAIN_BLOCK := CCL_BLOCK
          CCL_EOF_BLOCK := CCL_BLOCK

          CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
          STATEMENT :=
                  SET | IF | BRANCH | LOOP | REPEAT | BREAK
                  | READ | WRITE

          SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
                 | INT-OR-CHAR

          EXPRESSION := ARG | (EXPRESSION OP ARG)

          IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
          BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
          LOOP := (loop STATEMENT [STATEMENT ...])
          BREAK := (break)
          REPEAT := (repeat)
                  | (write-repeat [REG | INT-OR-CHAR | string])
                  | (write-read-repeat REG [INT-OR-CHAR | string | ARRAY]?)
          READ := (read REG) | (read REG REG)
                  | (read-if REG ARITH_OP ARG CCL_BLOCK CCL_BLOCK)
                  | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
          WRITE := (write REG) | (write REG REG)
                  | (write INT-OR-CHAR) | (write STRING) | STRING
                  | (write REG ARRAY)
          END := (end)

          REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
          ARG := REG | INT-OR-CHAR
          OP :=   + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
                  | < | > | == | <= | >= | !=
          SELF_OP :=
                  += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
          ARRAY := '[' INT-OR-CHAR ... ']'
          INT-OR-CHAR := INT | CHAR

     MACHINE CODE:

     The machine code consists of a vector of 32-bit words.
     The first such word specifies the start of the EOF section of the
     code; this is the code executed to handle any stuff that needs to
     be done (e.g. designating back to ASCII and left-to-right mode)
     after all other encoded/decoded data has been written out.  This
     is not used for charset CCL programs.

     REGISTER: 0..7  -- referred to by RRR or rrr

     OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT
             TTTTT (5-bit): operator type
             RRR (3-bit): register number
             XXXXXXXXXXXXXXX (15-bit):
                     CCCCCCCCCCCCCCC: constant or address
                     000000000000rrr: register number

     AAAAA:  00000   +
             00001   -
             00010   *
             00011   /
             00100   %
             00101   &
             00110   |
             00111   ~

             01000   <<
             01001   >>
             01010   <8
             01011   >8
             01100   //
             01101   not used
             01110   not used
             01111   not used

             10000   <
             10001   >
             10010   ==
             10011   <=
             10100   >=
             10101   !=

     OPERATORS:      TTTTT RRR XX..

     SetCS:          00000 RRR C...C      RRR = C...C
     SetCL:          00001 RRR .....      RRR = c...c
                     c.............c
     SetR:           00010 RRR ..rrr      RRR = rrr
     SetA:           00011 RRR ..rrr      RRR = array[rrr]
                     C.............C      size of array = C...C
                     c.............c      contents = c...c

     Jump:           00100 000 c...c      jump to c...c
     JumpCond:       00101 RRR c...c      if (!RRR) jump to c...c
     WriteJump:      00110 RRR c...c      Write1 RRR, jump to c...c
     WriteReadJump:  00111 RRR c...c      Write1, Read1 RRR, jump to c...c
     WriteCJump:     01000 000 c...c      Write1 C...C, jump to c...c
                     C...C
     WriteCReadJump: 01001 RRR c...c      Write1 C...C, Read1 RRR,
                     C.............C      and jump to c...c
     WriteSJump:     01010 000 c...c      WriteS, jump to c...c
                     C.............C
                     S.............S
                     ...
     WriteSReadJump: 01011 RRR c...c      WriteS, Read1 RRR, jump to c...c
                     C.............C
                     S.............S
                     ...
     WriteAReadJump: 01100 RRR c...c      WriteA, Read1 RRR, jump to c...c
                     C.............C      size of array = C...C
                     c.............c      contents = c...c
                     ...
     Branch:         01101 RRR C...C      if (RRR >= 0 && RRR < C..)
                     c.............c      branch to (RRR+1)th address
     Read1:          01110 RRR ...        read 1-byte to RRR
     Read2:          01111 RRR ..rrr      read 2-byte to RRR and rrr
     ReadBranch:     10000 RRR C...C      Read1 and Branch
                     c.............c
                     ...
     Write1:         10001 RRR .....      write 1-byte RRR
     Write2:         10010 RRR ..rrr      write 2-byte RRR and rrr
     WriteC:         10011 000 .....      write 1-char C...CC
                     C.............C
     WriteS:         10100 000 .....      write C..-byte of string
                     C.............C
                     S.............S
                     ...
     WriteA:         10101 RRR .....      write array[RRR]
                     C.............C      size of array = C...C
                     c.............c      contents = c...c
                     ...
     End:            10110 000 .....      terminate the execution

     SetSelfCS:      10111 RRR C...C      RRR AAAAA= C...C
                     ..........AAAAA
     SetSelfCL:      11000 RRR .....      RRR AAAAA= c...c
                     c.............c
                     ..........AAAAA
     SetSelfR:       11001 RRR ..Rrr      RRR AAAAA= rrr
                     ..........AAAAA
     SetExprCL:      11010 RRR ..Rrr      RRR = rrr AAAAA c...c
                     c.............c
                     ..........AAAAA
     SetExprR:       11011 RRR ..rrr      RRR = rrr AAAAA Rrr
                     ............Rrr
                     ..........AAAAA
     JumpCondC:      11100 RRR c...c      if !(RRR AAAAA C..) jump to c...c
                     C.............C
                     ..........AAAAA
     JumpCondR:      11101 RRR c...c      if !(RRR AAAAA rrr) jump to c...c
                     ............rrr
                     ..........AAAAA
     ReadJumpCondC:  11110 RRR c...c      Read1 and JumpCondC
                     C.............C
                     ..........AAAAA
     ReadJumpCondR:  11111 RRR c...c      Read1 and JumpCondR
                     ............rrr
                     ..........AAAAA

File: internals.info, Node: The Lisp Reader and Compiler, Next: Lstreams, Prev: MULE Character Sets and Encodings, Up: Top

The Lisp Reader and Compiler
****************************

Not yet documented.

File: internals.info, Node: Lstreams, Next: Consoles; Devices; Frames; Windows, Prev: The Lisp Reader and Compiler, Up: Top

Lstreams
********

An "lstream" is an internal Lisp object that provides a generic
buffering stream implementation.  Conceptually, you send data to the
stream or read data from the stream, not caring what's on the other end
of the stream.  The other end could be another stream, a file
descriptor, a stdio stream, a fixed block of memory, a reallocating
block of memory, etc.  The main purpose of the stream is to provide a
standard interface and to do buffering.  Macros are defined to read or
write characters, so the calling functions do not have to worry about
blocking data together in order to achieve efficiency.

* Menu:

* Creating an Lstream::         Creating an lstream object.
* Lstream Types::               Different sorts of things that are streamed.
* Lstream Functions::           Functions for working with lstreams.
* Lstream Methods::             Creating new lstream types.

File: internals.info, Node: Creating an Lstream, Next: Lstream Types, Up: Lstreams

Creating an Lstream
===================

Lstreams come in different types, depending on what is being interfaced
to.  Although the primitive for creating new lstreams is
`Lstream_new()', generally you do not call this directly.  Instead, you
call some type-specific creation function, which creates the lstream
and initializes it as appropriate for the particular type.

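The division of labor just described (a generic buffering layer, plus a
type-specific creation function that wires in the "other end") can be
sketched in a few lines of C.  This is only an illustration under
stated assumptions: `toy_stream', `toy_stream_impl', `toy_putc',
`toy_flush' and `toy_make_fixed_buffer_stream' are hypothetical names
and are not the declarations found in `lstream.h'.  The "other end"
here is a fixed block of memory, and the internal buffer is kept
deliberately tiny so that flushing is easy to observe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not the lstream.h API. */
typedef struct toy_stream toy_stream;

typedef struct {
    /* Send up to SIZE bytes to the other end; return bytes accepted. */
    int (*writer)(toy_stream *s, const unsigned char *data, int size);
} toy_stream_impl;

struct toy_stream {
    const toy_stream_impl *imp; /* type-specific methods */
    unsigned char buf[8];       /* generic buffer, small on purpose */
    int buffered;               /* bytes waiting in buf */
    /* State used only by the fixed-buffer stream type. */
    unsigned char *dest;
    int dest_size;
    int dest_used;
    int writer_calls;           /* counts how often the writer ran */
};

/* Hand buffered bytes to the writer.  Returns 0 if the buffer was
   emptied, -1 if the writer stopped accepting data. */
int toy_flush(toy_stream *s)
{
    int done = 0;
    while (done < s->buffered) {
        int n = s->imp->writer(s, s->buf + done, s->buffered - done);
        if (n <= 0)
            break;
        done += n;
    }
    memmove(s->buf, s->buf + done, (size_t) (s->buffered - done));
    s->buffered -= done;
    return s->buffered == 0 ? 0 : -1;
}

/* Write one byte; the caller never deals with blocking data together. */
int toy_putc(toy_stream *s, int c)
{
    if (s->buffered == (int) sizeof s->buf) {
        toy_flush(s);
        if (s->buffered == (int) sizeof s->buf)
            return -1;          /* still full: nothing could be written */
    }
    s->buf[s->buffered++] = (unsigned char) c;
    return 0;
}

/* Writer method for the fixed-buffer type: copy into DEST while room lasts. */
static int fixed_writer(toy_stream *s, const unsigned char *data, int size)
{
    int room = s->dest_size - s->dest_used;
    if (size > room)
        size = room;
    memcpy(s->dest + s->dest_used, data, (size_t) size);
    s->dest_used += size;
    s->writer_calls++;
    return size;
}

static const toy_stream_impl fixed_impl = { fixed_writer };

/* Type-specific creation function: sets up the generic part, then the
   state that only this stream type needs. */
void toy_make_fixed_buffer_stream(toy_stream *s, unsigned char *dest, int size)
{
    memset(s, 0, sizeof *s);
    s->imp = &fixed_impl;
    s->dest = dest;
    s->dest_size = size;
}
```

Writing 14 bytes one at a time through `toy_putc' reaches the writer
method only once before the final flush, which is the kind of batching
a buffering layer exists to provide.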
All lstream creation functions take a MODE argument, specifying what mode the lstream should be opened as. This controls whether the lstream is for input and output, and optionally whether data should be blocked up in units of MULE characters. Note that some types of lstreams can only be opened for input; others only for output; and others can be opened either way. #### Richard Mlynarik thinks that there should be a strict separation between input and output streams, and he's probably right. MODE is a string, one of `"r"' Open for reading. `"w"' Open for writing. `"rc"' Open for reading, but "read" never returns partial MULE characters. `"wc"' Open for writing, but never writes partial MULE characters.  File: internals.info, Node: Lstream Types, Next: Lstream Functions, Prev: Creating an Lstream, Up: Lstreams Lstream Types ============= stdio filedesc lisp-string fixed-buffer resizing-buffer dynarr lisp-buffer print decoding encoding  File: internals.info, Node: Lstream Functions, Next: Lstream Methods, Prev: Lstream Types, Up: Lstreams Lstream Functions ================= - Function: Lstream * Lstream_new (Lstream_implementation *IMP, CONST char *MODE) Allocate and return a new Lstream. This function is not really meant to be called directly; rather, each stream type should provide its own stream creation function, which creates the stream and does any other necessary creation stuff (e.g. opening a file). - Function: void Lstream_set_buffering (Lstream *LSTR, Lstream_buffering BUFFERING, int BUFFERING_SIZE) Change the buffering of a stream. See `lstream.h'. By default the buffering is `STREAM_BLOCK_BUFFERED'. - Function: int Lstream_flush (Lstream *LSTR) Flush out any pending unwritten data in the stream. Clear any buffered input data. Returns 0 on success, -1 on error. - Macro: int Lstream_putc (Lstream *STREAM, int C) Write out one byte to the stream. This is a macro and so it is very efficient. 
The C argument is only evaluated once but the STREAM argument is evaluated more than once. Returns 0 on success, -1 on error. - Macro: int Lstream_getc (Lstream *STREAM) Read one byte from the stream. This is a macro and so it is very efficient. The STREAM argument is evaluated more than once. Return value is -1 for EOF or error. - Macro: void Lstream_ungetc (Lstream *STREAM, int C) Push one byte back onto the input queue. This will be the next byte read from the stream. Any number of bytes can be pushed back and will be read in the reverse order they were pushed back - most recent first. (This is necessary for consistency - if there are a number of bytes that have been unread and I read and unread a byte, it needs to be the first to be read again.) This is a macro and so it is very efficient. The C argument is only evaluated once but the STREAM argument is evaluated more than once. - Function: int Lstream_fputc (Lstream *STREAM, int C) - Function: int Lstream_fgetc (Lstream *STREAM) - Function: void Lstream_fungetc (Lstream *STREAM, int C) Function equivalents of the above macros. - Function: int Lstream_read (Lstream *STREAM, void *DATA, int SIZE) Read SIZE bytes of DATA from the stream. Return the number of bytes read. 0 means EOF. -1 means an error occurred and no bytes were read. - Function: int Lstream_write (Lstream *STREAM, void *DATA, int SIZE) Write SIZE bytes of DATA to the stream. Return the number of bytes written. -1 means an error occurred and no bytes were written. - Function: void Lstream_unread (Lstream *STREAM, void *DATA, int SIZE) Push back SIZE bytes of DATA onto the input queue. The next call to `Lstream_read()' with the same size will read the same bytes back. Note that this will be the case even if there is other pending unread data. - Function: int Lstream_close (Lstream *STREAM) Close the stream. All data will be flushed out. - Function: void Lstream_reopen (Lstream *STREAM) Reopen a closed stream. This enables I/O on it again. 
This is not meant to be called except from a wrapper routine that reinitializes variables and such - the close routine may well have freed some necessary storage structures, for example. - Function: void Lstream_rewind (Lstream *STREAM) Rewind the stream to the beginning.  File: internals.info, Node: Lstream Methods, Prev: Lstream Functions, Up: Lstreams Lstream Methods =============== - Lstream Method: int reader (Lstream *STREAM, unsigned char *DATA, int SIZE) Read some data from the stream's end and store it into DATA, which can hold SIZE bytes. Return the number of bytes read. A return value of 0 means no bytes can be read at this time. This may be because of an EOF, or because there is a granularity greater than one byte that the stream imposes on the returned data, and SIZE is less than this granularity. (This will happen frequently for streams that need to return whole characters, because `Lstream_read()' calls the reader function repeatedly until it has the number of bytes it wants or until 0 is returned.) The lstream functions do not treat a 0 return as EOF or do anything special; however, the calling function will interpret any 0 it gets back as EOF. This will normally not happen unless the caller calls `Lstream_read()' with a very small size. This function can be `NULL' if the stream is output-only. - Lstream Method: int writer (Lstream *STREAM, CONST unsigned char *DATA, int SIZE) Send some data to the stream's end. Data to be sent is in DATA and is SIZE bytes. Return the number of bytes sent. This function can send and return fewer bytes than is passed in; in that case, the function will just be called again until there is no data left or 0 is returned. A return value of 0 means that no more data can be currently stored, but there is no error; the data will be squirreled away until the writer can accept data. (This is useful, e.g., if you're dealing with a non-blocking file descriptor and are getting `EWOULDBLOCK' errors.) 
     This function can be `NULL' if the stream is input-only.

 - Lstream Method: int rewinder (Lstream *STREAM)
     Rewind the stream. If this is `NULL', the stream is not seekable.

 - Lstream Method: int seekable_p (Lstream *STREAM)
     Indicate whether this stream is seekable - i.e. it can be rewound. This method is ignored if the stream does not have a rewind method. If this method is not present, the result is determined by whether a rewind method is present.

 - Lstream Method: int flusher (Lstream *STREAM)
     Perform any additional operations necessary to flush the data in this stream.

 - Lstream Method: int pseudo_closer (Lstream *STREAM)

 - Lstream Method: int closer (Lstream *STREAM)
     Perform any additional operations necessary to close this stream down. May be `NULL'. This function is called when `Lstream_close()' is called or when the stream is garbage-collected. When this function is called, all pending data in the stream will already have been written out.

 - Lstream Method: Lisp_Object marker (Lisp_Object LSTREAM, void (*MARKFUN) (Lisp_Object))
     Mark this object for garbage collection. Same semantics as a standard `Lisp_Object' marker. This function can be `NULL'.

File: internals.info, Node: Consoles; Devices; Frames; Windows, Next: The Redisplay Mechanism, Prev: Lstreams, Up: Top

Consoles; Devices; Frames; Windows
**********************************

* Menu:

* Introduction to Consoles; Devices; Frames; Windows::
* Point::
* Window Hierarchy::
* The Window Object::

File: internals.info, Node: Introduction to Consoles; Devices; Frames; Windows, Next: Point, Up: Consoles; Devices; Frames; Windows

Introduction to Consoles; Devices; Frames; Windows
==================================================

A window-system window that you see on the screen is called a "frame" in Emacs terminology. Each frame is subdivided into one or more non-overlapping panes, called (confusingly) "windows". Each window displays the text of a buffer in it. (See above on Buffers.)
Note that buffers and windows are independent entities: two or more windows can be displaying the same buffer (potentially in different locations), and a buffer can be displayed in no windows.

A single display screen that contains one or more frames is called a "display". Under most circumstances, there is only one display. However, more than one display can exist, for example if you have a "multi-headed" console, i.e. one with a single keyboard but multiple displays. (Typically in such a situation, the various displays act like one large display, in that the mouse is only in one of them at a time, and moving the mouse off one moves it into another.) In some cases, the different displays will have different characteristics, e.g. one color and one mono.

XEmacs can display frames on multiple displays. It can even deal simultaneously with frames on multiple keyboards (called "consoles" in XEmacs terminology). Here is one case where this might be useful: You are using XEmacs on your workstation at work, and leave it running. Then you go home and dial in on a TTY line, and you can use the already-running XEmacs process to display another frame on your local TTY.

Thus, there is a hierarchy console -> display -> frame -> window. There is a separate Lisp object type for each of these four concepts. Furthermore, there is logically a "selected console", "selected display", "selected frame", and "selected window". Each of these objects is distinguished in various ways, such as being the default object for various functions that act on objects of that type. Note that every containing object remembers the "selected" object among the objects that it contains: e.g. not only is there a selected window, but every frame remembers the last window in it that was selected, and changing the selected frame causes the remembered window within it to become the selected window. Similar relationships apply for consoles to devices and devices to frames.
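
The way each containing object remembers its selected child can be sketched in C. This is only a minimal illustration under stated assumptions, not the actual XEmacs structures: the struct layouts, the `selected_console' variable, and the `select_frame()' helper are hypothetical names, and plain pointers stand in for Lisp objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the four Lisp object types.
   Each container remembers which of its children was last selected. */
struct window  { int id; };
struct frame   { struct window *selected_window; };
struct device  { struct frame  *selected_frame;  };
struct console { struct device *selected_device; };

/* The one globally selected console (assumed name). */
static struct console *selected_console;

/* Every other "selected" object is found by following the chain of
   remembered children down from the selected console. */
static struct device *selected_device(void)
{
    return selected_console ? selected_console->selected_device : NULL;
}

static struct frame *selected_frame(void)
{
    struct device *d = selected_device();
    return d ? d->selected_frame : NULL;
}

static struct window *selected_window(void)
{
    struct frame *f = selected_frame();
    return f ? f->selected_window : NULL;
}

/* Select frame F, which lives on device D of console C.  The window
   that F remembers becomes the selected window as a side effect,
   because selected_window() reads it through the chain. */
static void select_frame(struct console *c, struct device *d,
                         struct frame *f)
{
    assert(c != NULL && d != NULL && f != NULL);
    d->selected_frame = f;
    c->selected_device = d;
    selected_console = c;
}
```

In this sketch, switching from one frame to another and back restores the first frame's remembered window without any extra bookkeeping, since the frame itself stores it.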
File: internals.info, Node: Point, Next: Window Hierarchy, Prev: Introduction to Consoles; Devices; Frames; Windows, Up: Consoles; Devices; Frames; Windows

Point
=====

Recall that every buffer has a current insertion position, called "point". Now, two or more windows may be displaying the same buffer, and the text cursor in the two windows (i.e. `point') can be in two different places. You may ask, how can that be, since each buffer has only one value of `point'? The answer is that each window also has a value of `point' that is squirreled away in it. There is only one selected window, and the buffer's value of `point' corresponds to that window. When the selected window is changed from one window to another displaying the same buffer, the old value of `point' is stored into the old window's "point" and the value of `point' from the new window is retrieved and made the value of `point' in the buffer. This means that `window-point' for the selected window is potentially inaccurate, and if you want to retrieve the correct value of `point' for a window, you must special-case on the selected window and retrieve the buffer's point instead. This is related to why `save-window-excursion' does not save the selected window's value of `point'.

File: internals.info, Node: Window Hierarchy, Next: The Window Object, Prev: Point, Up: Consoles; Devices; Frames; Windows

Window Hierarchy
================

If a frame contains multiple windows (panes), they are always created by splitting an existing window along the horizontal or vertical axis. Terminology is a bit confusing here: to "split a window horizontally" means to create two side-by-side windows, i.e. to make a *vertical* cut in a window. Likewise, to "split a window vertically" means to create two windows, one above the other, by making a *horizontal* cut.

If you split a window and then split again along the same axis, you will end up with a number of panes all arranged along the same axis.
The precise way in which the splits were made should not be important, and this is reflected internally. Internally, all windows are arranged in a tree, consisting of two types of windows, "combination" windows (which have children, and are covered completely by those children) and "leaf" windows, which have no children and are visible. Every combination window has two or more children, all arranged along the same axis. There are (logically) two subtypes of combination windows, depending on whether their children are horizontally or vertically arrayed. There is always one root window, which is either a leaf window (if the frame contains only one window) or a combination window (if the frame contains more than one window). In the latter case, the root window will have two or more children, either horizontally or vertically arrayed, and each of those children will be either a leaf window or another combination window.

Here are some rules:

  1. Horizontal combination windows can never have children that are horizontal combination windows; same for vertical.

  2. Only leaf windows can be split (obviously) and this splitting does one of two things: (a) turns the leaf window into a combination window and creates two new leaf children, or (b) turns the leaf window into one of the two new leaves and creates the other leaf. Rule (1) dictates which of these two outcomes happens.

  3. Every combination window must have at least two children.

  4. Leaf windows can never become combination windows. They can be deleted, however. If this results in a violation of (3), the parent combination window also gets deleted.

  5. All functions that accept windows must be prepared to accept combination windows, and do something sane (e.g. signal an error). Combination windows *do* escape to the Lisp level.

  6. All windows have three fields governing their contents: these are "hchild" (a list of horizontally-arrayed children), "vchild" (a list of vertically-arrayed children), and "buffer" (the buffer contained in a leaf window). Exactly one of these will be non-nil. Remember that "horizontally-arrayed" means "side-by-side" and "vertically-arrayed" means "one above the other".

  7. Leaf windows also have markers in their `start' (the first buffer position displayed in the window) and `pointm' (the window's stashed value of `point' - see above) fields, while combination windows have nil in these fields.

  8. The list of children for a window is threaded through the `next' and `prev' fields of each child window.

  9. *Deleted windows can be undeleted*. This happens as a result of restoring a window configuration, and is unlike frames, displays, and consoles, which, once deleted, can never be restored. Deleting a window does nothing except set a special `dead' bit to 1 and clear out the `next', `prev', `hchild', and `vchild' fields, for GC purposes.

 10. Most frames actually have two top-level windows - one for the minibuffer and one (the "root") for everything else. The modeline (if present) separates these two. The `next' field of the root points to the minibuffer, and the `prev' field of the minibuffer points to the root. The other `next' and `prev' fields are `nil', and the frame points to both of these windows. Minibuffer-less frames have no minibuffer window, and the `next' and `prev' of the root window are `nil'. Minibuffer-only frames have no root window, and the `next' of the minibuffer window is `nil' but the `prev' points to itself. (#### This is an artifact that should be fixed.)

File: internals.info, Node: The Window Object, Prev: Window Hierarchy, Up: Consoles; Devices; Frames; Windows

The Window Object
=================

Windows have the following accessible fields:

`frame'
     The frame that this window is on.

`mini_p'
     Non-`nil' if this window is a minibuffer window.

`buffer'
     The buffer that the window is displaying. This may change often during the life of the window.

`dedicated'
     Non-`nil' if this window is dedicated to its buffer.

`pointm'
     This is the value of point in the current buffer when this window is selected; when it is not selected, it retains its previous value.

`start'
     The position in the buffer that is the first character to be displayed in the window.

`force_start'
     If this flag is non-`nil', it says that the window has been scrolled explicitly by the Lisp program. This affects what the next redisplay does if point is off the screen: instead of scrolling the window to show the text around point, it moves point to a location that is on the screen.

`last_modified'
     The `modified' field of the window's buffer, as of the last time a redisplay completed in this window.

`last_point'
     The buffer's value of point, as of the last time a redisplay completed in this window.

`left'
     This is the left-hand edge of the window, measured in columns. (The leftmost column on the screen is column 0.)

`top'
     This is the top edge of the window, measured in lines. (The top line on the screen is line 0.)

`height'
     The height of the window, measured in lines.

`width'
     The width of the window, measured in columns.

`next'
     This is the window that is the next in the chain of siblings. It is `nil' in a window that is the rightmost or bottommost of a group of siblings.

`prev'
     This is the window that is the previous in the chain of siblings. It is `nil' in a window that is the leftmost or topmost of a group of siblings.

`parent'
     Internally, XEmacs arranges windows in a tree; each group of siblings has a parent window whose area includes all the siblings. This field points to a window's parent.

     Parent windows do not display buffers, and play little role in display except to shape their child windows. Emacs Lisp programs usually have no access to the parent windows; they operate on the windows at the leaves of the tree, which actually display buffers.

`hscroll'
     This is the number of columns that the display in the window is scrolled horizontally to the left. Normally, this is 0.

`use_time'
     This is the last time that the window was selected. The function `get-lru-window' uses this field.

`display_table'
     The window's display table, or `nil' if none is specified for it.

`update_mode_line'
     Non-`nil' means this window's mode line needs to be updated.

`base_line_number'
     The line number of a certain position in the buffer, or `nil'. This is used for displaying the line number of point in the mode line.

`base_line_pos'
     The position in the buffer for which the line number is known, or `nil' meaning none is known.

`region_showing'
     If the region (or part of it) is highlighted in this window, this field holds the mark position that made one end of that region. Otherwise, this field is `nil'.
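
The child and sibling fields (`hchild', `vchild', `buffer', `next', `prev', `parent') can be illustrated with a small C sketch of a window tree. This is a simplified model under stated assumptions rather than the real window object: plain pointers replace `Lisp_Object' fields, `NULL' plays the role of `nil', a buffer is represented only by its name, and the helper functions `window_valid_p()', `add_child()', and `count_leaves()' are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the window object. */
struct window {
    struct window *hchild;  /* first of the side-by-side children   */
    struct window *vchild;  /* first of the stacked children        */
    const char    *buffer;  /* displayed buffer name (leaves only)  */
    struct window *next;    /* next sibling, or NULL                */
    struct window *prev;    /* previous sibling, or NULL            */
    struct window *parent;  /* containing combination window        */
};

/* Exactly one of hchild, vchild, and buffer should be non-nil. */
static int window_valid_p(const struct window *w)
{
    return (w->hchild != NULL) + (w->vchild != NULL)
         + (w->buffer != NULL) == 1;
}

/* Append CHILD to combination window PARENT.  HORIZONTAL selects the
   hchild chain (side-by-side) over the vchild chain (stacked). */
static void add_child(struct window *parent, struct window *child,
                      int horizontal)
{
    assert(parent != NULL && child != NULL);
    assert(parent->buffer == NULL);   /* leaves cannot hold children */
    struct window **head = horizontal ? &parent->hchild : &parent->vchild;
    child->parent = parent;
    child->next = NULL;
    child->prev = NULL;
    if (*head == NULL) {
        *head = child;
        return;
    }
    struct window *last = *head;
    while (last->next != NULL)
        last = last->next;
    last->next = child;
    child->prev = last;
}

/* Count the leaf windows at or below W.  Only the children's `next'
   links are followed, never W's own, so a minibuffer hanging off the
   root's `next' field would not be counted. */
static int count_leaves(const struct window *w)
{
    if (w->buffer != NULL)
        return 1;
    int n = 0;
    const struct window *c = w->hchild ? w->hchild : w->vchild;
    for (; c != NULL; c = c->next)
        n += count_leaves(c);
    return n;
}
```

A frame split once side-by-side, with its right half then split top and bottom, would be modelled as a horizontal combination root with one leaf child and one vertical combination child holding two leaves, for three leaves in total.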