1 This is Info file ../../info/internals.info, produced by Makeinfo
2 version 1.68 from the input file internals.texi.
4 INFO-DIR-SECTION XEmacs Editor
6 * Internals: (internals). XEmacs Internals Manual.
9 Copyright (C) 1992 - 1996 Ben Wing. Copyright (C) 1996, 1997 Sun
10 Microsystems. Copyright (C) 1994 - 1998 Free Software Foundation.
11 Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
13 Permission is granted to make and distribute verbatim copies of this
14 manual provided the copyright notice and this permission notice are
15 preserved on all copies.
17 Permission is granted to copy and distribute modified versions of
18 this manual under the conditions for verbatim copying, provided that the
19 entire resulting derived work is distributed under the terms of a
20 permission notice identical to this one.
22 Permission is granted to copy and distribute translations of this
23 manual into another language, under the above conditions for modified
24 versions, except that this permission notice may be stated in a
25 translation approved by the Foundation.
27 Permission is granted to copy and distribute modified versions of
28 this manual under the conditions for verbatim copying, provided also
29 that the section entitled "GNU General Public License" is included
30 exactly as in the original, and provided that the entire resulting
31 derived work is distributed under the terms of a permission notice
32 identical to this one.
34 Permission is granted to copy and distribute translations of this
35 manual into another language, under the above conditions for modified
36 versions, except that the section entitled "GNU General Public License"
37 may be included in a translation approved by the Free Software
38 Foundation instead of in the original English.
41 File: internals.info, Node: Buffer Lists, Next: Markers and Extents, Prev: The Text in a Buffer, Up: Buffers and Textual Representation
46 Recall from earlier that buffers are "permanent" objects, i.e. that they
47 remain around until explicitly deleted. This entails that there is a
48 list of all the buffers in existence. This list is actually an
49 assoc-list (mapping from the buffer's name to the buffer) and is stored
50 in the global variable `Vbuffer_alist'.
52 The order of the buffers in the list is important: the buffers are
53 ordered approximately from most-recently-used to least-recently-used.
54 Switching to a buffer using `switch-to-buffer', `pop-to-buffer', etc.
55 and switching windows using `other-window', etc. usually brings the
56 new current buffer to the front of the list. `switch-to-buffer',
57 `other-buffer', etc. look at the beginning of the list to find an
58 alternative buffer to suggest. You can also explicitly move a buffer
59 to the end of the list using `bury-buffer'.
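
The ordering behavior described above can be sketched in a few lines of C. This is only a toy model under stated assumptions: the real `Vbuffer_alist' is a Lisp alist, while `struct buffer_order', `order_select' and `order_bury' are hypothetical names that track buffer names in a fixed array, with the front of the array standing for the most recently used buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BUFFERS 8

/* Toy stand-in for the buffer list: names only, front = most recent. */
struct buffer_order {
    const char *names[MAX_BUFFERS];
    size_t count;
};

/* Return the index of NAME in the list, or -1 if it is absent. */
int order_find(const struct buffer_order *o, const char *name)
{
    assert(o->count <= MAX_BUFFERS);
    for (size_t i = 0; i < o->count; i++)
        if (strcmp(o->names[i], name) == 0)
            return (int)i;
    return -1;
}

/* Move NAME to the front, as selecting a buffer usually does. */
void order_select(struct buffer_order *o, const char *name)
{
    int i = order_find(o, name);
    if (i < 0)
        return;
    const char *saved = o->names[i];
    /* Shift the entries before NAME one slot toward the end. */
    memmove(&o->names[1], &o->names[0], (size_t)i * sizeof o->names[0]);
    o->names[0] = saved;
}

/* Move NAME to the end, as `bury-buffer' does. */
void order_bury(struct buffer_order *o, const char *name)
{
    int i = order_find(o, name);
    if (i < 0)
        return;
    const char *saved = o->names[i];
    /* Shift the entries after NAME one slot toward the front. */
    memmove(&o->names[i], &o->names[i + 1],
            (o->count - (size_t)i - 1) * sizeof o->names[0]);
    o->names[o->count - 1] = saved;
}
```

Starting from the order a, b, c, selecting c gives c, a, b, and burying c afterwards gives a, b, c again.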
61 In addition to the global ordering in `Vbuffer_alist', each frame
62 has its own ordering of the list. These lists always contain the same
63 elements as in `Vbuffer_alist' although possibly in a different order.
64 `buffer-list' normally returns the list for the selected frame. This
65 allows you to work in separate frames without things interfering with each other.
68 The standard way to look up a buffer given a name is `get-buffer',
69 and the standard way to create a new buffer is `get-buffer-create',
70 which looks up a buffer with a given name, creating a new one if
71 necessary. These operations correspond exactly with the symbol
72 operations `intern-soft' and `intern', respectively. You can also
73 force a new buffer to be created using `generate-new-buffer', which
74 takes a name and (if necessary) makes a unique name from this by
75 appending a number, and then creates the buffer. This is basically
76 like the symbol operation `gensym'.
79 File: internals.info, Node: Markers and Extents, Next: Bufbytes and Emchars, Prev: Buffer Lists, Up: Buffers and Textual Representation
84 Among the things associated with a buffer are things that are
85 logically attached to certain buffer positions. This can be used to
86 keep track of a buffer position when text is inserted and deleted, so
87 that it remains at the same spot relative to the text around it; to
88 assign properties to particular sections of text; etc. There are two
89 such objects that are useful in this regard: "markers" and "extents".
92 A "marker" is simply a flag placed at a particular buffer position,
93 which is moved around as text is inserted and deleted. Markers are
94 used for all sorts of purposes, such as the `mark' that is the other
95 end of textual regions to be cut, copied, etc.
97 An "extent" is similar to two markers plus some associated
98 properties, and is used to keep track of regions in a buffer as text is
99 inserted and deleted, and to add properties (e.g. fonts) to particular
100 regions of text. The external interface of extents is explained
103 The important thing here is that markers and extents simply contain
104 buffer positions in them as integers, and every time text is inserted or
105 deleted, these positions must be updated. In order to minimize the
106 amount of shuffling that needs to be done, the positions in markers and
107 extents (there's one per marker, two per extent) are stored in Meminds.
108 This means that they only need to be moved when the text is physically
109 moved in memory; since the gap structure tries to minimize this, it also
110 minimizes the number of marker and extent indices that need to be
111 adjusted. Look in `insdel.c' for the details of how this works.
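
To see why memory indices save work, consider the following minimal sketch. It is not the code in `insdel.c'; `struct gap_info' and the three functions are hypothetical, positions are 0-based, and the convention that a position exactly at the gap maps to the memory index just after the gap is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy gap description: text lives in memory indices [0, gap_start)
   and [gap_start + gap_size, end of allocation). */
struct gap_info {
    size_t gap_start;  /* memory index where the gap begins */
    size_t gap_size;   /* number of unused bytes in the gap */
};

/* Convert a text position to the memory index a marker would store. */
size_t pos_to_memind(const struct gap_info *g, size_t pos)
{
    return pos < g->gap_start ? pos : pos + g->gap_size;
}

/* Convert a stored memory index back to a text position. */
size_t memind_to_pos(const struct gap_info *g, size_t memind)
{
    /* A valid memory index never points inside the gap. */
    assert(memind < g->gap_start ||
           memind >= g->gap_start + g->gap_size);
    return memind < g->gap_start ? memind : memind - g->gap_size;
}

/* Insert NBYTES of text at the gap: the gap shrinks from the front.
   No stored memory index has to change. */
void insert_at_gap(struct gap_info *g, size_t nbytes)
{
    assert(nbytes <= g->gap_size);
    g->gap_start += nbytes;
    g->gap_size -= nbytes;
}
```

With a gap at memory index 5 of size 10, a marker at text position 8 stores memory index 18. After three bytes are inserted at the gap, the same stored index 18 converts to position 11, so the marker has followed its text without being touched.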
113 One other important distinction is that markers are "temporary"
114 while extents are "permanent". This means that markers disappear as
115 soon as there are no more pointers to them, and correspondingly, there
116 is no way to determine what markers are in a buffer if you are just
117 given the buffer. Extents remain in a buffer until they are detached
118 (which could happen as a result of text being deleted) or the buffer is
119 deleted, and primitives do exist to enumerate the extents in a buffer.
122 File: internals.info, Node: Bufbytes and Emchars, Next: The Buffer Object, Prev: Markers and Extents, Up: Buffers and Textual Representation
130 File: internals.info, Node: The Buffer Object, Prev: Bufbytes and Emchars, Up: Buffers and Textual Representation
135 Buffers contain fields not directly accessible by the Lisp
136 programmer. We describe them here, naming them by the names used in
137 the C code. Many are accessible indirectly in Lisp programs via Lisp
141 The buffer name is a string that names the buffer. It is
142 guaranteed to be unique. *Note Buffer Names: (lispref)Buffer
146 This field contains the time when the buffer was last saved, as an
147 integer. *Note Buffer Modification: (lispref)Buffer Modification.
150 This field contains the modification time of the visited file. It
151 is set when the file is written or read. Every time the buffer is
152 written to the file, this field is compared to the modification
153 time of the file. *Note Buffer Modification: (lispref)Buffer
157 This field contains the time when the buffer was last auto-saved.
160 This field contains the `window-start' position in the buffer as of
161 the last time the buffer was displayed in a window.
164 This field points to the buffer's undo list. *Note Undo:
168 This field contains the syntax table for the buffer. *Note Syntax
169 Tables: (lispref)Syntax Tables.
172 This field contains the conversion table for converting text to
173 lower case. *Note Case Tables: (lispref)Case Tables.
176 This field contains the conversion table for converting text to
177 upper case. *Note Case Tables: (lispref)Case Tables.
180 This field contains the conversion table for canonicalizing text
181 for case-folding search. *Note Case Tables: (lispref)Case Tables.
184 This field contains the equivalence table for case-folding search.
185 *Note Case Tables: (lispref)Case Tables.
188 This field contains the buffer's display table, or `nil' if it
189 doesn't have one. *Note Display Tables: (lispref)Display Tables.
192 This field contains the chain of all markers that currently point
193 into the buffer. Deletion of text in the buffer, and motion of
194 the buffer's gap, must check each of these markers and perhaps
195 update it. *Note Markers: (lispref)Markers.
198 This field is a flag that tells whether a backup file has been
199 made for the visited file of this buffer.
202 This field contains the mark for the buffer. The mark is a marker,
203 hence it is also included on the list `markers'. *Note The Mark:
207 This field is non-`nil' if the buffer's mark is active.
210 This field contains the association list describing the variables
211 local in this buffer, and their values, with the exception of
212 local variables that have special slots in the buffer object.
213 (Those slots are omitted from this table.) *Note Buffer-Local
214 Variables: (lispref)Buffer-Local Variables.
217 This field contains a Lisp object which controls how to display
218 the mode line for this buffer. *Note Modeline Format:
219 (lispref)Modeline Format.
222 This field holds the buffer's base buffer (if it is an indirect
226 File: internals.info, Node: MULE Character Sets and Encodings, Next: The Lisp Reader and Compiler, Prev: Buffers and Textual Representation, Up: Top
228 MULE Character Sets and Encodings
229 *********************************
231 Recall that there are two primary ways that text is represented in
232 XEmacs. The "buffer" representation sees the text as a series of bytes
233 (Bufbytes), with a variable number of bytes used per character. The
234 "character" representation sees the text as a series of integers
235 (Emchars), one per character. The character representation is a cleaner
236 representation from a theoretical standpoint, and is thus used in many
237 cases when lots of manipulations on a string need to be done. However,
238 the buffer representation is the standard representation used in both
239 Lisp strings and buffers, and because of this, it is the "default"
240 representation that text comes in. The reason for using this
241 representation is that it's compact and is compatible with ASCII.
247 * Internal Mule Encodings::
251 File: internals.info, Node: Character Sets, Next: Encodings, Up: MULE Character Sets and Encodings
256 A character set (or "charset") is an ordered set of characters. A
257 particular character in a charset is indexed using one or more
258 "position codes", which are non-negative integers. The number of
259 position codes needed to identify a particular character in a charset is
260 called the "dimension" of the charset. In XEmacs/Mule, all charsets
261 have dimension 1 or 2, and the size of all charsets (except for a few
262 special cases) is either 94, 96, 94 by 94, or 96 by 96. The range of
263 position codes used to index characters from any of these types of
264 character sets is as follows:
266 Charset type Position code 1 Position code 2
267 ------------------------------------------------------------
268 94 33 - 126 N/A
269 96 32 - 127 N/A
270 94x94 33 - 126 33 - 126
271 96x96 32 - 127 32 - 127
273 Note that in the above cases position codes do not start at an
274 expected value such as 0 or 1. The reason for this will become clear later.
277 For example, Latin-1 is a 96-character charset, and JISX0208 (the
278 Japanese national character set) is a 94x94-character charset.
280 [Note that, although the ranges above define the *valid* position
281 codes for a charset, some of the slots in a particular charset may in
282 fact be empty. This is the case for JISX0208, for example, where (e.g.)
283 all the slots whose first position code is in the range 118 - 127 are empty.]
286 There are three charsets that do not follow the above rules. All of
287 them have one dimension, and have ranges of position codes as follows:
289 Charset name Position code 1
290 ------------------------------------
293 Composite 0 - some large number
295 (The upper bound of the position code for composite characters has
296 not yet been determined, but it will probably be at least 16,383).
298 ASCII is the union of two subsidiary character sets: Printing-ASCII
299 (the printing ASCII character set, consisting of position codes 33 -
300 126, like for a standard 94-character charset) and Control-ASCII (the
301 non-printing characters that would appear in a binary file with codes 0 - 32 and 127).
304 Control-1 contains the non-printing characters that would appear in a
305 binary file with codes 128 - 159.
307 Composite contains characters that are generated by overstriking one
308 or more characters from other charsets.
310 Note that some characters in ASCII, and all characters in Control-1,
311 are "control" (non-printing) characters. These have no printed
312 representation but instead control some other function of the printing
313 (e.g. TAB, code 9, moves the current character position to the next tab
314 stop). All other characters in all charsets are "graphic" (printing) characters.
317 When a binary file is read in, the bytes in the file are assigned to
318 character sets as follows:
320 Bytes Character set Range
321 --------------------------------------------------
322 0 - 127 ASCII 0 - 127
323 128 - 159 Control-1 0 - 31
324 160 - 255 Latin-1 32 - 127
326 This is a bit ad-hoc but gets the job done.
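
The byte-to-charset table above translates directly into code. The following is a sketch only; `enum toy_charset', `struct toy_char' and `classify_binary_byte' are hypothetical names, not identifiers from the XEmacs sources.

```c
#include <assert.h>

/* The three charsets a binary byte can be assigned to. */
enum toy_charset { CS_ASCII, CS_CONTROL_1, CS_LATIN_1 };

struct toy_char {
    enum toy_charset charset;
    int position_code;
};

/* Assign one byte of a binary file to a charset and position code. */
struct toy_char classify_binary_byte(unsigned char byte)
{
    struct toy_char c;
    if (byte <= 127) {            /* bytes 0 - 127: ASCII, code unchanged */
        c.charset = CS_ASCII;
        c.position_code = byte;
    } else if (byte <= 159) {     /* bytes 128 - 159: Control-1, 0 - 31 */
        c.charset = CS_CONTROL_1;
        c.position_code = byte - 128;
    } else {                      /* bytes 160 - 255: Latin-1, 32 - 127 */
        c.charset = CS_LATIN_1;
        c.position_code = byte - 128;
    }
    assert(c.position_code >= 0 && c.position_code <= 127);
    return c;
}
```

For instance, byte 0xE9 lands in Latin-1 with position code 0x69, and byte 0x85 lands in Control-1 with position code 5.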
329 File: internals.info, Node: Encodings, Next: Internal Mule Encodings, Prev: Character Sets, Up: MULE Character Sets and Encodings
334 An "encoding" is a way of numerically representing characters from
335 one or more character sets. If an encoding only encompasses one
336 character set, then the position codes for the characters in that
337 character set could be used directly. This is not possible, however, if
338 more than one character set is to be used in the encoding.
340 For example, the conversion detailed above between bytes in a binary
341 file and characters is effectively an encoding that encompasses the
342 three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit bytes.
345 Thus, an encoding can be viewed as a way of encoding characters from
346 a specified group of character sets using a stream of bytes, each of
347 which contains a fixed number of bits (but not necessarily 8, as in the
348 common usage of "byte").
350 Here are descriptions of a couple of common encodings:
354 * Japanese EUC (Extended Unix Code)::
358 File: internals.info, Node: Japanese EUC (Extended Unix Code), Next: JIS7, Up: Encodings
360 Japanese EUC (Extended Unix Code)
361 ---------------------------------
363 This encompasses the character sets Printing-ASCII,
364 Japanese-JISX0208, and Japanese-JISX0201-Kana (half-width katakana, the
365 right half of JISX0201). It uses 8-bit bytes.
367 Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
368 charsets, while Japanese-JISX0208 is a 94x94-character charset.
370 The encoding is as follows:
372 Character set Representation (PC=position-code)
373 ------------- --------------
375 Japanese-JISX0201-Kana 0x8E | PC1 + 0x80
376 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80
377 Japanese-JISX0212 0x8F | PC1 + 0x80 | PC2 + 0x80
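
As an illustration, here is a small encoder for the three character sets named at the start of this node. It is a sketch, not the XEmacs implementation: `enum euc_charset' and `euc_encode' are hypothetical names, and it assumes (as is standard for EUC) that Printing-ASCII characters are written as their position code unchanged.

```c
#include <assert.h>
#include <stddef.h>

enum euc_charset { EUC_ASCII, EUC_JISX0201_KANA, EUC_JISX0208 };

/* Encode one character into OUT and return the number of bytes
   written (1 or 2).  PC2 is ignored for the one-dimensional sets. */
size_t euc_encode(enum euc_charset cs, int pc1, int pc2,
                  unsigned char out[2])
{
    assert(pc1 >= 0x21 && pc1 <= 0x7E);  /* 94-charset position code */
    switch (cs) {
    case EUC_ASCII:                      /* PC1 */
        out[0] = (unsigned char)pc1;
        return 1;
    case EUC_JISX0201_KANA:              /* 0x8E | PC1 + 0x80 */
        out[0] = 0x8E;
        out[1] = (unsigned char)(pc1 + 0x80);
        return 2;
    case EUC_JISX0208:                   /* PC1 + 0x80 | PC2 + 0x80 */
        assert(pc2 >= 0x21 && pc2 <= 0x7E);
        out[0] = (unsigned char)(pc1 + 0x80);
        out[1] = (unsigned char)(pc2 + 0x80);
        return 2;
    }
    return 0;
}
```

A JISX0208 character with position codes 0x30 and 0x21 comes out as the two bytes 0xB0 0xA1. Because every non-ASCII byte has its high bit set, the encoding needs no state, in contrast to the modal encoding in the next node.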
380 File: internals.info, Node: JIS7, Prev: Japanese EUC (Extended Unix Code), Up: Encodings
385 This encompasses the character sets Printing-ASCII,
386 Japanese-JISX0201-Roman (the left half of JISX0201; this character set
387 is very similar to Printing-ASCII and is a 94-character charset),
388 Japanese-JISX0208, and Japanese-JISX0201-Kana. It uses 7-bit bytes.
390 Unlike Japanese EUC, this is a "modal" encoding, which means that
391 there are multiple states that the encoding can be in, which affect how
392 the bytes are to be interpreted. Special sequences of bytes (called
393 "escape sequences") are used to change states.
395 The encoding is as follows:
397 Character set Representation (PC=position-code)
398 ------------- --------------
400 Japanese-JISX0201-Roman PC1
401 Japanese-JISX0201-Kana PC1
402 Japanese-JISX0208 PC1 PC2
405 Escape sequence ASCII equivalent Meaning
406 --------------- ---------------- -------
407 0x1B 0x28 0x4A ESC ( J invoke Japanese-JISX0201-Roman
408 0x1B 0x28 0x49 ESC ( I invoke Japanese-JISX0201-Kana
409 0x1B 0x24 0x42 ESC $ B invoke Japanese-JISX0208
410 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII
412 Initially, Printing-ASCII is invoked.
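
The modal behavior can be sketched as a tiny encoder that remembers which character set is currently invoked and emits an escape sequence only when that changes. `struct jis7_encoder', `jis7_init' and `jis7_put' are hypothetical names chosen for this example; the escape sequences are the ones in the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum jis7_charset { JIS7_ASCII, JIS7_ROMAN, JIS7_KANA, JIS7_JISX0208 };

/* The only state of the encoder: the currently invoked charset. */
struct jis7_encoder {
    enum jis7_charset invoked;
};

/* Initially, Printing-ASCII is invoked. */
void jis7_init(struct jis7_encoder *e)
{
    e->invoked = JIS7_ASCII;
}

/* Write one character to OUT (room for 5 bytes is needed) and return
   the number of bytes written, including any escape sequence. */
size_t jis7_put(struct jis7_encoder *e, enum jis7_charset cs,
                int pc1, int pc2, unsigned char *out)
{
    /* Escape sequences indexed by enum jis7_charset. */
    static const unsigned char escapes[4][3] = {
        { 0x1B, 0x28, 0x42 },  /* ESC ( B  Printing-ASCII */
        { 0x1B, 0x28, 0x4A },  /* ESC ( J  Japanese-JISX0201-Roman */
        { 0x1B, 0x28, 0x49 },  /* ESC ( I  Japanese-JISX0201-Kana */
        { 0x1B, 0x24, 0x42 },  /* ESC $ B  Japanese-JISX0208 */
    };
    size_t n = 0;

    assert(pc1 >= 0x21 && pc1 <= 0x7E);
    if (e->invoked != cs) {    /* change state only when necessary */
        memcpy(out, escapes[cs], 3);
        n = 3;
        e->invoked = cs;
    }
    out[n++] = (unsigned char)pc1;
    if (cs == JIS7_JISX0208)
        out[n++] = (unsigned char)pc2;
    return n;
}
```

Writing an ASCII character, then two JISX0208 characters, then another ASCII character produces 1, 5, 2 and 4 bytes respectively: the escape sequence is paid for only at each switch.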
415 File: internals.info, Node: Internal Mule Encodings, Next: CCL, Prev: Encodings, Up: MULE Character Sets and Encodings
417 Internal Mule Encodings
418 =======================
420 In XEmacs/Mule, each character set is assigned a unique number,
421 called a "leading byte". This is used in the encodings of a character.
422 Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has
423 a leading byte of 0), although some leading bytes are reserved.
425 Charsets whose leading byte is in the range 0x80 - 0x9F are called
426 "official" and are used for built-in charsets. Other charsets are
427 called "private" and have leading bytes in the range 0xA0 - 0xFF; these
428 are user-defined charsets.
432 Character set Leading byte
433 ------------- ------------
436 Dimension-1 Official 0x81 - 0x8D
439 Dimension-2 Official 0x90 - 0x99
440 (0x9A - 0x9D are free;
441 0x9E and 0x9F are reserved)
442 Dimension-1 Private 0xA0 - 0xEF
443 Dimension-2 Private 0xF0 - 0xFF
445 There are two internal encodings for characters in XEmacs/Mule. One
446 is called "string encoding" and is an 8-bit encoding that is used for
447 representing characters in a buffer or string. It uses 1 to 4 bytes per
448 character. The other is called "character encoding" and is a 19-bit
449 encoding that is used for representing characters individually in a
452 (In the following descriptions, we'll ignore composite characters for
453 the moment. We also give a general (structural) overview first,
454 followed later by the exact details.)
458 * Internal String Encoding::
459 * Internal Character Encoding::
462 File: internals.info, Node: Internal String Encoding, Next: Internal Character Encoding, Up: Internal Mule Encodings
464 Internal String Encoding
465 ------------------------
467 ASCII characters are encoded using their position code directly.
468 Other characters are encoded using their leading byte followed by their
469 position code(s) with the high bit set. Characters in private character
470 sets have their leading byte prefixed with a "leading byte prefix",
471 which is either 0x9E or 0x9F. (No character sets are ever assigned these
472 leading bytes.) Specifically:
474 Character set Encoding (PC=position-code, LB=leading-byte)
475 ------------- --------
477 Control-1 LB | PC1 + 0xA0 |
478 Dimension-1 official LB | PC1 + 0x80 |
479 Dimension-1 private 0x9E | LB | PC1 + 0x80 |
480 Dimension-2 official LB | PC1 + 0x80 | PC2 + 0x80 |
481 Dimension-2 private 0x9F | LB | PC1 + 0x80 | PC2 + 0x80
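
The table can be turned into a short encoder. The sketch below is an assumption-laden illustration rather than the real XEmacs code: `string_encode' is a hypothetical name, the charset's dimension and official/private status are inferred from the leading-byte ranges given in the previous node, and Control-1, Composite and the free or reserved leading bytes are simply rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one character in the internal string encoding.  LB is the
   leading byte of its charset (0 for ASCII).  Returns the number of
   bytes written to OUT (1 to 4), or 0 if LB is not handled here. */
size_t string_encode(unsigned lb, unsigned pc1, unsigned pc2,
                     unsigned char out[4])
{
    size_t n = 0;
    int two_dimensional;

    if (lb == 0x00) {                       /* ASCII: position code as is */
        out[n++] = (unsigned char)pc1;
        return n;
    }
    if (lb >= 0x81 && lb <= 0x8D) {         /* dimension-1 official */
        two_dimensional = 0;
    } else if (lb >= 0x90 && lb <= 0x99) {  /* dimension-2 official */
        two_dimensional = 1;
    } else if (lb >= 0xA0 && lb <= 0xEF) {  /* dimension-1 private */
        out[n++] = 0x9E;                    /* leading byte prefix */
        two_dimensional = 0;
    } else if (lb >= 0xF0 && lb <= 0xFF) {  /* dimension-2 private */
        out[n++] = 0x9F;                    /* leading byte prefix */
        two_dimensional = 1;
    } else {
        return 0;                           /* not covered by this sketch */
    }
    assert(pc1 >= 0x20 && pc1 <= 0x7F);
    out[n++] = (unsigned char)lb;
    out[n++] = (unsigned char)(pc1 + 0x80); /* high bit set */
    if (two_dimensional) {
        assert(pc2 >= 0x20 && pc2 <= 0x7F);
        out[n++] = (unsigned char)(pc2 + 0x80);
    }
    return n;
}
```

In every case the first byte written is below 0xA0 and all later bytes are 0xA0 or above, which is the property the rest of this node relies on.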
483 The basic characteristic of this encoding is that the first byte of
484 all characters is in the range 0x00 - 0x9F, and the second and
485 following bytes of all characters are in the range 0xA0 - 0xFF. This
486 means that it is impossible to get out of sync, or more specifically:
488 1. Given any byte position, the beginning of the character it is
489 within can be determined in constant time.
491 2. Given any byte position at the beginning of a character, the
492 beginning of the next character can be determined in constant time.
494 3. Given any byte position at the beginning of a character, the
495 beginning of the previous character can be determined in constant time.
498 4. Textual searches can simply treat encoded strings as if they were
499 encoded in a one-byte-per-character fashion rather than the actual
502 None of the standard non-modal encodings meet all of these
503 conditions. For example, EUC satisfies only (2) and (3), while
504 Shift-JIS and Big5 (not yet described) satisfy only (2). (All non-modal
505 encodings must satisfy (2), in order to be unambiguous.)
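
Properties (1) through (3) follow from a single test: a byte below 0xA0 starts a character, and any other byte continues one. A minimal sketch, with hypothetical function names, looks like this. Each loop runs at most three times, because a character occupies at most four bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Property 1: the start of the character that byte I belongs to. */
size_t char_start(const unsigned char *s, size_t i)
{
    while (i > 0 && s[i] >= 0xA0)   /* skip back over trailing bytes */
        i--;
    return i;
}

/* Property 2: the start of the character after the one starting at I. */
size_t next_char(const unsigned char *s, size_t len, size_t i)
{
    assert(i < len);
    i++;
    while (i < len && s[i] >= 0xA0) /* skip forward over trailing bytes */
        i++;
    return i;
}

/* Property 3: the start of the character before the one starting at I. */
size_t prev_char(const unsigned char *s, size_t i)
{
    assert(i > 0);
    i--;
    while (i > 0 && s[i] >= 0xA0)
        i--;
    return i;
}
```

For the five bytes 0x41 0x92 0xB0 0xA1 0x42 (an ASCII character, a three-byte character, and another ASCII character), `char_start' at index 3 yields 1, `next_char' at index 1 yields 4, and `prev_char' at index 4 yields 1.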
508 File: internals.info, Node: Internal Character Encoding, Prev: Internal String Encoding, Up: Internal Mule Encodings
510 Internal Character Encoding
511 ---------------------------
513 One 19-bit word represents a single character. The word is
514 separated into three fields:
516 Bit number: 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
517 <------------> <------------------> <------------------>
520 Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5 bits.
523 Character set Field 1 Field 2 Field 3
524 ------------- ------- ------- -------
529 Dimension-1 official 0 LB - 0x80 PC1
530 range: (01 - 0D) (20 - 7F)
531 Dimension-1 private 0 LB - 0x80 PC1
532 range: (20 - 6F) (20 - 7F)
533 Dimension-2 official LB - 0x8F PC1 PC2
534 range: (01 - 0A) (20 - 7F) (20 - 7F)
535 Dimension-2 private LB - 0xE1 PC1 PC2
536 range: (0F - 1E) (20 - 7F) (20 - 7F)
539 Note that character codes 0 - 255 are the same as the "binary
540 encoding" described above.
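
The field layout can be made concrete with a short sketch. `make_emchar' and `emchar_from_charset' are hypothetical names, not the macros used in the XEmacs sources, and only the four table rows whose leading-byte ranges appear above are handled. The sample leading byte 0x81 for Latin-1 in the note that follows is likewise an assumption made for illustration.

```c
#include <assert.h>

/* Pack the three fields into one 19-bit character code:
   field 1 in bits 18-14, field 2 in bits 13-7, field 3 in bits 6-0. */
long make_emchar(unsigned f1, unsigned f2, unsigned f3)
{
    assert(f1 < 32 && f2 < 128 && f3 < 128);
    return ((long)f1 << 14) | ((long)f2 << 7) | (long)f3;
}

/* Build a character code from a leading byte and position codes.
   Returns -1 for leading bytes this sketch does not cover. */
long emchar_from_charset(unsigned lb, unsigned pc1, unsigned pc2)
{
    if ((lb >= 0x81 && lb <= 0x8D) ||      /* dimension-1 official */
        (lb >= 0xA0 && lb <= 0xEF))        /* dimension-1 private */
        return make_emchar(0, lb - 0x80, pc1);
    if (lb >= 0x90 && lb <= 0x99)          /* dimension-2 official */
        return make_emchar(lb - 0x8F, pc1, pc2);
    if (lb >= 0xF0 && lb <= 0xFF)          /* dimension-2 private */
        return make_emchar(lb - 0xE1, pc1, pc2);
    return -1;
}
```

If Latin-1 has leading byte 0x81, its position codes 0x20 - 0x7F map to character codes 160 - 255, which agrees with the binary encoding of bytes 160 - 255. The largest code this scheme can produce, for leading byte 0xFF with both position codes 0x7F, is 507903, which fits in 19 bits.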
543 File: internals.info, Node: CCL, Prev: Internal Mule Encodings, Up: MULE Character Sets and Encodings
549 CCL_PROGRAM := (CCL_MAIN_BLOCK
552 CCL_MAIN_BLOCK := CCL_BLOCK
553 CCL_EOF_BLOCK := CCL_BLOCK
555 CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
557 SET | IF | BRANCH | LOOP | REPEAT | BREAK
560 SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
563 EXPRESSION := ARG | (EXPRESSION OP ARG)
565 IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
566 BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
567 LOOP := (loop STATEMENT [STATEMENT ...])
570 | (write-repeat [REG | INT-OR-CHAR | string])
571 | (write-read-repeat REG [INT-OR-CHAR | string | ARRAY]?)
572 READ := (read REG) | (read REG REG)
573 | (read-if REG ARITH_OP ARG CCL_BLOCK CCL_BLOCK)
574 | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
575 WRITE := (write REG) | (write REG REG)
576 | (write INT-OR-CHAR) | (write STRING) | STRING
580 REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
581 ARG := REG | INT-OR-CHAR
582 OP := + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
583 | < | > | == | <= | >= | !=
585 += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
586 ARRAY := '[' INT-OR-CHAR ... ']'
587 INT-OR-CHAR := INT | CHAR
591 The machine code consists of a vector of 32-bit words.
592 The first such word specifies the start of the EOF section of the code;
593 this is the code executed to handle any stuff that needs to be done
594 (e.g. designating back to ASCII and left-to-right mode) after all
595 other encoded/decoded data has been written out. This is not used for
596 charset CCL programs.
598 REGISTER: 0..7 -- referred to by RRR or rrr
600 OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT
601 TTTTT (5-bit): operator type
602 RRR (3-bit): register number
603 XXXXXXXXXXXXXXX (15-bit):
604 CCCCCCCCCCCCCCC: constant or address
605 000000000000rrr: register number
632 OPERATORS: TTTTT RRR XX..
634 SetCS: 00000 RRR C...C RRR = C...C
635 SetCL: 00001 RRR ..... RRR = c...c
637 SetR: 00010 RRR ..rrr RRR = rrr
638 SetA: 00011 RRR ..rrr RRR = array[rrr]
639 C.............C size of array = C...C
640 c.............c contents = c...c
642 Jump: 00100 000 c...c jump to c...c
643 JumpCond: 00101 RRR c...c if (!RRR) jump to c...c
644 WriteJump: 00110 RRR c...c Write1 RRR, jump to c...c
645 WriteReadJump: 00111 RRR c...c Write1, Read1 RRR, jump to c...c
646 WriteCJump: 01000 000 c...c Write1 C...C, jump to c...c
648 WriteCReadJump: 01001 RRR c...c Write1 C...C, Read1 RRR,
649 C.............C and jump to c...c
650 WriteSJump: 01010 000 c...c WriteS, jump to c...c
654 WriteSReadJump: 01011 RRR c...c WriteS, Read1 RRR, jump to c...c
658 WriteAReadJump: 01100 RRR c...c WriteA, Read1 RRR, jump to c...c
659 C.............C size of array = C...C
660 c.............c contents = c...c
662 Branch: 01101 RRR C...C if (RRR >= 0 && RRR < C..)
663 c.............c branch to (RRR+1)th address
664 Read1: 01110 RRR ... read 1-byte to RRR
665 Read2: 01111 RRR ..rrr read 2-byte to RRR and rrr
666 ReadBranch: 10000 RRR C...C Read1 and Branch
669 Write1: 10001 RRR ..... write 1-byte RRR
670 Write2: 10010 RRR ..rrr write 2-byte RRR and rrr
671 WriteC: 10011 000 ..... write 1-char C...C
673 WriteS: 10100 000 ..... write C..-byte of string
677 WriteA: 10101 RRR ..... write array[RRR]
678 C.............C size of array = C...C
679 c.............c contents = c...c
681 End: 10110 000 ..... terminate the execution
683 SetSelfCS: 10111 RRR C...C RRR AAAAA= C...C
685 SetSelfCL: 11000 RRR ..... RRR AAAAA= c...c
688 SetSelfR: 11001 RRR ..Rrr RRR AAAAA= rrr
690 SetExprCL: 11010 RRR ..Rrr RRR = rrr AAAAA c...c
693 SetExprR: 11011 RRR ..rrr RRR = rrr AAAAA Rrr
696 JumpCondC: 11100 RRR c...c if !(RRR AAAAA C..) jump to c...c
699 JumpCondR: 11101 RRR c...c if !(RRR AAAAA rrr) jump to c...c
702 ReadJumpCondC: 11110 RRR c...c Read1 and JumpCondC
705 ReadJumpCondR: 11111 RRR c...c Read1 and JumpCondR
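
The operator bit field described at the top of this listing can be packed and unpacked as sketched below. This assumes that the fields are written most significant first, so that TTTTT occupies the low 5 bits, RRR the next 3, and the 15-bit constant or address the bits above those. The function names are hypothetical and are not taken from the CCL interpreter.

```c
#include <assert.h>

/* Pack an operator word: X (15 bits) | RRR (3 bits) | TTTTT (5 bits). */
unsigned ccl_pack(unsigned op, unsigned reg, unsigned x)
{
    assert(op < 32 && reg < 8 && x < (1u << 15));
    return (x << 8) | (reg << 5) | op;
}

/* Extract the 5-bit operator type. */
unsigned ccl_op(unsigned word)
{
    return word & 0x1F;
}

/* Extract the 3-bit register number. */
unsigned ccl_reg(unsigned word)
{
    return (word >> 5) & 0x07;
}

/* Extract the 15-bit constant, address, or second register. */
unsigned ccl_arg(unsigned word)
{
    return (word >> 8) & 0x7FFF;
}
```

Under these assumptions, `Write1' (10001) applied to register 2 packs to the word 81, and a `Jump' (00100) to address 300 unpacks back to operator 4 and argument 300.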
710 File: internals.info, Node: The Lisp Reader and Compiler, Next: Lstreams, Prev: MULE Character Sets and Encodings, Up: Top
712 The Lisp Reader and Compiler
713 ****************************
718 File: internals.info, Node: Lstreams, Next: Consoles; Devices; Frames; Windows, Prev: The Lisp Reader and Compiler, Up: Top
723 An "lstream" is an internal Lisp object that provides a generic
724 buffering stream implementation. Conceptually, you send data to the
725 stream or read data from the stream, not caring what's on the other end
726 of the stream. The other end could be another stream, a file
727 descriptor, a stdio stream, a fixed block of memory, a reallocating
728 block of memory, etc. The main purpose of the stream is to provide a
729 standard interface and to do buffering. Macros are defined to read or
730 write characters, so the calling functions do not have to worry about
731 blocking data together in order to achieve efficiency.
735 * Creating an Lstream:: Creating an lstream object.
736 * Lstream Types:: Different sorts of things that are streamed.
737 * Lstream Functions:: Functions for working with lstreams.
738 * Lstream Methods:: Creating new lstream types.
741 File: internals.info, Node: Creating an Lstream, Next: Lstream Types, Up: Lstreams
746 Lstreams come in different types, depending on what is being
747 interfaced to. Although the primitive for creating new lstreams is
748 `Lstream_new()', generally you do not call this directly. Instead, you
749 call some type-specific creation function, which creates the lstream
750 and initializes it as appropriate for the particular type.
752 All lstream creation functions take a MODE argument, specifying what
753 mode the lstream should be opened as. This controls whether the
754 lstream is for input or output, and optionally whether data should be
755 blocked up in units of MULE characters. Note that some types of
756 lstreams can only be opened for input; others only for output; and
757 others can be opened either way. #### Richard Mlynarik thinks that
758 there should be a strict separation between input and output streams,
759 and he's probably right.
761 MODE is a string, one of
770 Open for reading, but "read" never returns partial MULE characters.
773 Open for writing, but never writes partial MULE characters.
776 File: internals.info, Node: Lstream Types, Next: Lstream Functions, Prev: Creating an Lstream, Up: Lstreams
801 File: internals.info, Node: Lstream Functions, Next: Lstream Methods, Prev: Lstream Types, Up: Lstreams
806 - Function: Lstream * Lstream_new (Lstream_implementation *IMP, CONST
808 Allocate and return a new Lstream. This function is not really
809 meant to be called directly; rather, each stream type should
810 provide its own stream creation function, which creates the stream
811 and does any other necessary creation stuff (e.g. opening a file).
813 - Function: void Lstream_set_buffering (Lstream *LSTR,
814 Lstream_buffering BUFFERING, int BUFFERING_SIZE)
815 Change the buffering of a stream. See `lstream.h'. By default the
816 buffering is `STREAM_BLOCK_BUFFERED'.
818 - Function: int Lstream_flush (Lstream *LSTR)
819 Flush out any pending unwritten data in the stream. Clear any
820 buffered input data. Returns 0 on success, -1 on error.
822 - Macro: int Lstream_putc (Lstream *STREAM, int C)
823 Write out one byte to the stream. This is a macro and so it is
824 very efficient. The C argument is only evaluated once but the
825 STREAM argument is evaluated more than once. Returns 0 on
826 success, -1 on error.
828 - Macro: int Lstream_getc (Lstream *STREAM)
829 Read one byte from the stream. This is a macro and so it is very
830 efficient. The STREAM argument is evaluated more than once.
831 Return value is -1 for EOF or error.
833 - Macro: void Lstream_ungetc (Lstream *STREAM, int C)
834 Push one byte back onto the input queue. This will be the next
835 byte read from the stream. Any number of bytes can be pushed back
836 and will be read in the reverse order they were pushed back - most
837 recent first. (This is necessary for consistency - if there are a
838 number of bytes that have been unread and I read and unread a
839 byte, it needs to be the first to be read again.) This is a macro
840 and so it is very efficient. The C argument is only evaluated
841 once but the STREAM argument is evaluated more than once.
843 - Function: int Lstream_fputc (Lstream *STREAM, int C)
844 - Function: int Lstream_fgetc (Lstream *STREAM)
845 - Function: void Lstream_fungetc (Lstream *STREAM, int C)
846 Function equivalents of the above macros.
848 - Function: int Lstream_read (Lstream *STREAM, void *DATA, int SIZE)
849 Read SIZE bytes of DATA from the stream. Return the number of
850 bytes read. 0 means EOF. -1 means an error occurred and no bytes were read.
853 - Function: int Lstream_write (Lstream *STREAM, void *DATA, int SIZE)
854 Write SIZE bytes of DATA to the stream. Return the number of
855 bytes written. -1 means an error occurred and no bytes were written.
858 - Function: void Lstream_unread (Lstream *STREAM, void *DATA, int SIZE)
859 Push back SIZE bytes of DATA onto the input queue. The next call
860 to `Lstream_read()' with the same size will read the same bytes
861 back. Note that this will be the case even if there is other
864 - Function: int Lstream_close (Lstream *STREAM)
865 Close the stream. All data will be flushed out.
867 - Function: void Lstream_reopen (Lstream *STREAM)
868 Reopen a closed stream. This enables I/O on it again. This is not
869 meant to be called except from a wrapper routine that reinitializes
870 variables and such - the close routine may well have freed some
871 necessary storage structures, for example.
873 - Function: void Lstream_rewind (Lstream *STREAM)
874 Rewind the stream to the beginning.
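
The push-back rules for `Lstream_ungetc()' and `Lstream_unread()' are easy to get backwards, so here is a self-contained toy model of them. It does not use the real Lstream API: `struct toy_input' and the `toy_' functions are hypothetical, and the push-back area is a small fixed-size stack.

```c
#include <assert.h>
#include <stddef.h>

#define PUSHBACK_MAX 16

/* Toy input stream over a fixed block of memory. */
struct toy_input {
    const unsigned char *data;
    size_t len, pos;
    unsigned char pushed[PUSHBACK_MAX];  /* stack of pushed-back bytes */
    size_t pushed_count;
};

/* Read one byte: pushed-back bytes first, most recent first.
   Returns -1 at end of data. */
int toy_getc(struct toy_input *s)
{
    if (s->pushed_count > 0)
        return s->pushed[--s->pushed_count];
    if (s->pos < s->len)
        return s->data[s->pos++];
    return -1;
}

/* Push one byte back; it becomes the next byte read. */
void toy_ungetc(struct toy_input *s, int c)
{
    assert(s->pushed_count < PUSHBACK_MAX);
    s->pushed[s->pushed_count++] = (unsigned char)c;
}

/* Push back SIZE bytes so that they are read again in their
   original order: the last byte is pushed first. */
void toy_unread(struct toy_input *s, const unsigned char *data, size_t size)
{
    while (size > 0)
        toy_ungetc(s, data[--size]);
}
```

Pushing back x and then y makes the next two reads return y and then x, whereas unreading the two bytes "hi" makes them return h and then i.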
877 File: internals.info, Node: Lstream Methods, Prev: Lstream Functions, Up: Lstreams
882 - Lstream Method: int reader (Lstream *STREAM, unsigned char *DATA,
884 Read some data from the stream's end and store it into DATA, which
885 can hold SIZE bytes. Return the number of bytes read. A return
886 value of 0 means no bytes can be read at this time. This may be
887 because of an EOF, or because there is a granularity greater than
888 one byte that the stream imposes on the returned data, and SIZE is
889 less than this granularity. (This will happen frequently for
890 streams that need to return whole characters, because
891 `Lstream_read()' calls the reader function repeatedly until it has
892 the number of bytes it wants or until 0 is returned.) The lstream
893 functions do not treat a 0 return as EOF or do anything special;
894 however, the calling function will interpret any 0 it gets back as
895 EOF. This will normally not happen unless the caller calls
896 `Lstream_read()' with a very small size.
898 This function can be `NULL' if the stream is output-only.
900 - Lstream Method: int writer (Lstream *STREAM, CONST unsigned char
902 Send some data to the stream's end. Data to be sent is in DATA
903 and is SIZE bytes. Return the number of bytes sent. This
904 function can send and return fewer bytes than is passed in; in that
905 case, the function will just be called again until there is no
906 data left or 0 is returned. A return value of 0 means that no
907 more data can be currently stored, but there is no error; the data
908 will be squirreled away until the writer can accept data. (This is
909 useful, e.g., if you're dealing with a non-blocking file
910 descriptor and are getting `EWOULDBLOCK' errors.) This function
911 can be `NULL' if the stream is input-only.
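A minimal sketch of the calling convention just described, assuming a hypothetical sink that accepts at most a few bytes per call and eventually fills up. The names `toy_sink', `toy_writer' and `toy_write_all' are invented for this illustration. The loop stops when all data has been sent or the writer returns 0, and the unsent remainder stays with the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sink with limited room, accepting at most CHUNK bytes
   per call (loosely modelling a non-blocking descriptor). */
struct toy_sink {
  unsigned char buf[8];
  size_t used;                /* bytes already stored in buf */
  size_t chunk;               /* most bytes accepted by one call */
};

/* Writer method: may accept fewer bytes than offered.  Returns 0 when
   nothing more can be stored right now; that is not an error. */
static int
toy_writer (struct toy_sink *s, const unsigned char *data, size_t size)
{
  size_t room = sizeof s->buf - s->used;
  size_t n = size;

  if (n > s->chunk)
    n = s->chunk;
  if (n > room)
    n = room;
  memcpy (s->buf + s->used, data, n);
  s->used += n;
  return (int) n;
}

/* Call the writer again and again until no data is left or it
   returns 0.  Returns the number of bytes actually sent. */
static size_t
toy_write_all (struct toy_sink *s, const unsigned char *data, size_t size)
{
  size_t sent = 0;

  while (sent < size)
    {
      int n = toy_writer (s, data + sent, size - sent);
      if (n <= 0)
        break;                /* keep the rest for a later attempt */
      sent += (size_t) n;
    }
  return sent;
}
```

In this toy version the caller sees how many bytes were taken and must hold on to the remainder itself; the text above says the lstream layer does that buffering on the writer's behalf.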
913 - Lstream Method: int rewinder (Lstream *STREAM)
914 Rewind the stream. If this is `NULL', the stream is not seekable.
916 - Lstream Method: int seekable_p (Lstream *STREAM)
917 Indicate whether this stream is seekable - i.e. it can be rewound.
918 This method is ignored if the stream does not have a rewind
919 method. If this method is not present, the result is determined
920 by whether a rewind method is present.
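The decision rule for seekability can be written down compactly. The following is a sketch only; `toy_methods' and `toy_is_seekable' are hypothetical names, and the stream is passed as an opaque pointer because the real `Lstream' structure is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method table holding just the two relevant methods. */
struct toy_methods {
  int (*rewinder) (void *stream);
  int (*seekable_p) (void *stream);
};

/* Trivial methods used to exercise the rule. */
static int toy_rewind_ok (void *stream) { (void) stream; return 0; }
static int toy_yes (void *stream) { (void) stream; return 1; }
static int toy_no (void *stream) { (void) stream; return 0; }

/* Seekability as described above:
   - no rewinder: not seekable, and seekable_p is ignored;
   - rewinder but no seekable_p: seekable;
   - both present: ask seekable_p. */
static int
toy_is_seekable (const struct toy_methods *m, void *stream)
{
  assert (m != NULL);
  if (m->rewinder == NULL)
    return 0;
  if (m->seekable_p == NULL)
    return 1;
  return m->seekable_p (stream) != 0;
}
```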
922 - Lstream Method: int flusher (Lstream *STREAM)
923 Perform any additional operations necessary to flush the data in
926 - Lstream Method: int pseudo_closer (Lstream *STREAM)
928 - Lstream Method: int closer (Lstream *STREAM)
929 Perform any additional operations necessary to close this stream
930 down. May be `NULL'. This function is called when
931 `Lstream_close()' is called or when the stream is
932 garbage-collected. When this function is called, all pending data
933 in the stream will already have been written out.
935 - Lstream Method: Lisp_Object marker (Lisp_Object LSTREAM, void
936 (*MARKFUN) (Lisp_Object))
937 Mark this object for garbage collection. Same semantics as a
938 standard `Lisp_Object' marker. This function can be `NULL'.
941 File: internals.info, Node: Consoles; Devices; Frames; Windows, Next: The Redisplay Mechanism, Prev: Lstreams, Up: Top
943 Consoles; Devices; Frames; Windows
944 **********************************
948 * Introduction to Consoles; Devices; Frames; Windows::
951 * The Window Object::
954 File: internals.info, Node: Introduction to Consoles; Devices; Frames; Windows, Next: Point, Up: Consoles; Devices; Frames; Windows
956 Introduction to Consoles; Devices; Frames; Windows
957 ==================================================
959 A window-system window that you see on the screen is called a
960 "frame" in Emacs terminology. Each frame is subdivided into one or
961 more non-overlapping panes, called (confusingly) "windows". Each
962 window displays the text of a buffer in it. (See above on Buffers.) Note
963 that buffers and windows are independent entities: Two or more windows
964 can be displaying the same buffer (potentially in different locations),
965 and a buffer can be displayed in no windows.
967 A single display screen that contains one or more frames is called a
968 "display". Under most circumstances, there is only one display.
969 However, more than one display can exist, for example if you have a
970 "multi-headed" console, i.e. one with a single keyboard but multiple
971 displays. (Typically in such a situation, the various displays act like
972 one large display, in that the mouse is only in one of them at a time,
973 and moving the mouse off of one moves it into another.) In some cases,
974 the different displays will have different characteristics, e.g. one
977 XEmacs can display frames on multiple displays. It can even deal
978 simultaneously with frames on multiple keyboards (called "consoles" in
979 XEmacs terminology). Here is one case where this might be useful: You
980 are using XEmacs on your workstation at work, and leave it running.
981 Then you go home and dial in on a TTY line, and you can use the
982 already-running XEmacs process to display another frame on your local
985 Thus, there is a hierarchy console -> display -> frame -> window.
986 There is a separate Lisp object type for each of these four concepts.
987 Furthermore, there is logically a "selected console", "selected
988 display", "selected frame", and "selected window". Each of these
989 objects is distinguished in various ways, such as being the default
990 object for various functions that act on objects of that type. Note
991 that every containing object remembers the "selected" object among
992 the objects that it contains: e.g. not only is there a selected window,
993 but every frame remembers the last window in it that was selected, and
994 changing the selected frame causes the remembered window within it to
995 become the selected window. Similar relationships apply for consoles
996 to devices (the "displays" above) and for devices to frames.
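The way each container remembers its own selected member can be pictured with a small sketch. All structure and function names below are hypothetical and much simpler than the real Lisp objects; the only point illustrated is that the selected window is found by following the chain of remembered selections, so switching frames automatically brings back the window last selected in that frame.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down containers: each one records only the
   member that was last selected within it. */
struct toy_window  { int id; };
struct toy_frame   { struct toy_window *selected_window; };
struct toy_device  { struct toy_frame *selected_frame; };
struct toy_console { struct toy_device *selected_device; };

static struct toy_console *toy_selected_console;

/* The selected window is whatever the chain of remembered selections
   leads to. */
static struct toy_window *
toy_selected_window (void)
{
  assert (toy_selected_console != NULL);
  return toy_selected_console->selected_device
           ->selected_frame->selected_window;
}

/* Selecting a frame only changes what the device remembers; the
   frame's own remembered window then becomes the selected window. */
static void
toy_select_frame (struct toy_device *d, struct toy_frame *f)
{
  d->selected_frame = f;
}
```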
999 File: internals.info, Node: Point, Next: Window Hierarchy, Prev: Introduction to Consoles; Devices; Frames; Windows, Up: Consoles; Devices; Frames; Windows
1004 Recall that every buffer has a current insertion position, called
1005 "point". Now, two or more windows may be displaying the same buffer,
1006 and the text cursor in the two windows (i.e. `point') can be in two
1007 different places. You may ask, how can that be, since each buffer has
1008 only one value of `point'? The answer is that each window also has a
1009 value of `point' that is squirreled away in it. There is only one
1010 selected window, and the value of "point" in that buffer corresponds to
1011 that window. When the selected window is changed from one window to
1012 another displaying the same buffer, the old value of `point' is stored
1013 into the old window's "point" and the value of `point' from the new
1014 window is retrieved and made the value of `point' in the buffer. This
1015 means that `window-point' for the selected window is potentially
1016 inaccurate, and if you want to retrieve the correct value of `point'
1017 for a window, you must special-case on the selected window and retrieve
1018 the buffer's point instead. This is related to why
1019 `save-window-excursion' does not save the selected window's value of
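The exchange of `point' between a buffer and its windows can be sketched as follows. This is a simplified model under stated assumptions: positions are plain integers rather than markers, and `toy_buffer', `toy_window', `toy_select_window' and `toy_window_point' are hypothetical names that do not appear in the sources.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buffer { long point; };            /* the buffer's one point */
struct toy_window {
  struct toy_buffer *buffer;
  long pointm;                                /* stashed point */
};

static struct toy_window *toy_selected;

/* Change the selected window: stash the buffer's point in the window
   being left, then load the new window's stashed point. */
static void
toy_select_window (struct toy_window *w)
{
  assert (w != NULL && w->buffer != NULL);
  if (w == toy_selected)
    return;
  if (toy_selected != NULL)
    toy_selected->pointm = toy_selected->buffer->point;
  toy_selected = w;
  w->buffer->point = w->pointm;
}

/* Accurate point for any window: the stashed value is stale for the
   selected window, so special-case it and use the buffer's point. */
static long
toy_window_point (const struct toy_window *w)
{
  return w == toy_selected ? w->buffer->point : w->pointm;
}
```

After selecting a window and moving the buffer's point, the window's stashed value still holds the old position until another window is selected, which is why `toy_window_point' consults the buffer for the selected window.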
1023 File: internals.info, Node: Window Hierarchy, Next: The Window Object, Prev: Point, Up: Consoles; Devices; Frames; Windows
1028 If a frame contains multiple windows (panes), they are always created
1029 by splitting an existing window along the horizontal or vertical axis.
1030 Terminology is a bit confusing here: to "split a window horizontally"
1031 means to create two side-by-side windows, i.e. to make a *vertical* cut
1032 in a window. Likewise, to "split a window vertically" means to create
1033 two windows, one above the other, by making a *horizontal* cut.
1035 If you split a window and then split again along the same axis, you
1036 will end up with a number of panes all arranged along the same axis.
1037 The precise way in which the splits were made should not be important,
1038 and this is reflected internally. Internally, all windows are arranged
1039 in a tree, consisting of two types of windows, "combination" windows
1040 (which have children, and are covered completely by those children) and
1041 "leaf" windows, which have no children and are visible. Every
1042 combination window has two or more children, all arranged along the same
1043 axis. There are (logically) two subtypes of combination windows,
1044 depending on whether their children are horizontally or vertically
1045 arrayed. There is always one root window, which is either a leaf
1046 window (if the frame contains only one window) or a combination window
1047 (if the frame contains more than one). In the latter case, the root
1048 window will have two or more children, either horizontally or vertically
1049 arrayed, and each of those children will be either a leaf window or another
1052 Here are some rules:
1054 1. Horizontal combination windows can never have children that are
1055 horizontal combination windows; same for vertical.
1057 2. Only leaf windows can be split (obviously) and this splitting does
1058 one of two things: (a) turns the leaf window into a combination
1059 window and creates two new leaf children, or (b) turns the leaf
1060 window into one of the two new leaves and creates the other leaf.
1061 Rule (1) dictates which of these two outcomes happens.
1063 3. Every combination window must have at least two children.
1065 4. Leaf windows can never become combination windows. They can be
1066 deleted, however. If this results in a violation of (3), the
1067 parent combination window also gets deleted.
1069 5. All functions that accept windows must be prepared to accept
1070 combination windows, and do something sane with them (e.g. signal an
1071 error). Combination windows *do* escape to the Lisp level.
1073 6. All windows have three fields governing their contents: these are
1074 "hchild" (a list of horizontally-arrayed children), "vchild" (a
1075 list of vertically-arrayed children), and "buffer" (the buffer
1076 contained in a leaf window). Exactly one of these will be
1077 non-nil. Remember that "horizontally-arrayed" means
1078 "side-by-side" and "vertically-arrayed" means "one above the
1081 7. Leaf windows also have markers in their `start' (the first buffer
1082 position displayed in the window) and `pointm' (the window's
1083 stashed value of `point' - see above) fields, while combination
1084 windows have nil in these fields.
1086 8. The list of children for a window is threaded through the `next'
1087 and `prev' fields of each child window.
1089 9. *Deleted windows can be undeleted*. This happens as a result of
1090 restoring a window configuration, and is unlike frames, displays,
1091 and consoles, which, once deleted, can never be restored.
1092 Deleting a window does nothing except set a special `dead' bit to
1093 1 and clear out the `next', `prev', `hchild', and `vchild' fields,
1096 10. Most frames actually have two top-level windows - one for the
1097 minibuffer and one (the "root") for everything else. The modeline
1098 (if present) separates these two. The `next' field of the root
1099 points to the minibuffer, and the `prev' field of the minibuffer
1100 points to the root. The other `next' and `prev' fields are `nil',
1101 and the frame points to both of these windows. Minibuffer-less
1102 frames have no minibuffer window, and the `next' and `prev' of the
1103 root window are `nil'. Minibuffer-only frames have no root
1104 window, and the `next' of the minibuffer window is `nil' but the
1105 `prev' points to itself. (#### This is an artifact that should be
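Rules (1), (2), (6) and (8) can be illustrated with a small sketch of splitting. Everything here is hypothetical: `toy_window', `toy_new_window', `toy_split' and `toy_content_fields' are invented names, buffers are plain strings, and the minibuffer window of rule (10) is ignored. For outcome (a) the sketch interposes a fresh combination window above the leaf, so that the leaf itself stays a leaf as rule (4) requires; that reading of rule (2) is an assumption of this sketch. Nothing is ever freed, to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_window {
  struct toy_window *parent, *next, *prev;   /* rule (8) threading */
  struct toy_window *hchild;   /* first of the side-by-side children */
  struct toy_window *vchild;   /* first of the stacked children */
  const char *buffer;          /* non-NULL only in a leaf */
};

static struct toy_window *
toy_new_window (const char *buffer)
{
  struct toy_window *w = calloc (1, sizeof *w);
  if (w != NULL)
    w->buffer = buffer;
  return w;
}

/* Rule (6): exactly one of the three content fields should be set. */
static int
toy_content_fields (const struct toy_window *w)
{
  return (w->hchild != NULL) + (w->vchild != NULL) + (w->buffer != NULL);
}

/* Split LEAF; HORIZONTAL non-zero means side by side.  Returns the
   new leaf placed after LEAF, or NULL if allocation fails. */
static struct toy_window *
toy_split (struct toy_window **root, struct toy_window *leaf,
           int horizontal)
{
  struct toy_window *p = leaf->parent;
  struct toy_window *fresh;
  int same_axis;

  assert (leaf->buffer != NULL);             /* only leaves are split */
  same_axis = p != NULL
    && (horizontal ? p->hchild != NULL : p->vchild != NULL);

  fresh = toy_new_window (leaf->buffer);
  if (fresh == NULL)
    return NULL;

  if (!same_axis)
    {
      /* Outcome (a): a new combination window takes LEAF's place in
         the tree and LEAF becomes its first child.  Rule (1) holds
         because this branch is taken only across axes. */
      struct toy_window *comb = toy_new_window (NULL);
      if (comb == NULL)
        {
          free (fresh);
          return NULL;
        }
      comb->parent = p;
      comb->prev = leaf->prev;
      comb->next = leaf->next;
      if (comb->prev != NULL)
        comb->prev->next = comb;
      if (comb->next != NULL)
        comb->next->prev = comb;
      if (p != NULL && p->hchild == leaf)
        p->hchild = comb;
      if (p != NULL && p->vchild == leaf)
        p->vchild = comb;
      if (*root == leaf)
        *root = comb;
      if (horizontal)
        comb->hchild = leaf;
      else
        comb->vchild = leaf;
      leaf->parent = comb;
      leaf->prev = NULL;
      leaf->next = NULL;
      p = comb;
    }

  /* Outcome (b), and the tail of (a): link the new leaf after LEAF. */
  fresh->parent = p;
  fresh->prev = leaf;
  fresh->next = leaf->next;
  if (leaf->next != NULL)
    leaf->next->prev = fresh;
  leaf->next = fresh;
  return fresh;
}
```

Splitting the same leaf twice along one axis yields three siblings under a single combination window, while a split across the axis nests a new combination window in the place the leaf used to occupy.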
1109 File: internals.info, Node: The Window Object, Prev: Window Hierarchy, Up: Consoles; Devices; Frames; Windows
1114 Windows have the following accessible fields:
1117 The frame that this window is on.
1120 Non-`nil' if this window is a minibuffer window.
1123 The buffer that the window is displaying. This may change often
1124 during the life of the window.
1127 Non-`nil' if this window is dedicated to its buffer.
1130 This is the value of point in the current buffer when this window
1131 is selected; when it is not selected, it retains its previous
1135 The position in the buffer that is the first character to be
1136 displayed in the window.
1139 If this flag is non-`nil', it says that the window has been
1140 scrolled explicitly by the Lisp program. This affects what the
1141 next redisplay does if point is off the screen: instead of
1142 scrolling the window to show the text around point, it moves point
1143 to a location that is on the screen.
1146 The `modified' field of the window's buffer, as of the last time a
1147 redisplay completed in this window.
1150 The buffer's value of point, as of the last time a redisplay
1151 completed in this window.
1154 This is the left-hand edge of the window, measured in columns.
1155 (The leftmost column on the screen is column 0.)
1158 This is the top edge of the window, measured in lines. (The top
1159 line on the screen is line 0.)
1162 The height of the window, measured in lines.
1165 The width of the window, measured in columns.
1168 This is the window that is the next in the chain of siblings. It
1169 is `nil' in a window that is the rightmost or bottommost of a
1173 This is the window that is the previous in the chain of siblings.
1174 It is `nil' in a window that is the leftmost or topmost of a group
1178 Internally, XEmacs arranges windows in a tree; each group of
1179 siblings has a parent window whose area includes all the siblings.
1180 This field points to a window's parent.
1182 Parent windows do not display buffers, and play little role in
1183 display except to shape their child windows. Emacs Lisp programs
1184 usually have no access to the parent windows; they operate on the
1185 windows at the leaves of the tree, which actually display buffers.
1188 This is the number of columns that the display in the window is
1189 scrolled horizontally to the left. Normally, this is 0.
1192 This is the last time that the window was selected. The function
1193 `get-lru-window' uses this field.
1196 The window's display table, or `nil' if none is specified for it.
1199 Non-`nil' means this window's mode line needs to be updated.
1202 The line number of a certain position in the buffer, or `nil'.
1203 This is used for displaying the line number of point in the mode
1207 The position in the buffer for which the line number is known, or
1208 `nil' meaning none is known.
1211 If the region (or part of it) is highlighted in this window, this
1212 field holds the mark position that made one end of that region.
1213 Otherwise, this field is `nil'.