1 This is ../info/lispref.info, produced by makeinfo version 4.6 from
4 INFO-DIR-SECTION XEmacs Editor
6 * Lispref: (lispref). XEmacs Lisp Reference Manual.
11 GNU Emacs Lisp Reference Manual Second Edition (v2.01), May 1993 GNU
12 Emacs Lisp Reference Manual Further Revised (v2.02), August 1993 Lucid
13 Emacs Lisp Reference Manual (for 19.10) First Edition, March 1994
14 XEmacs Lisp Programmer's Manual (for 19.12) Second Edition, April 1995
15 GNU Emacs Lisp Reference Manual v2.4, June 1995 XEmacs Lisp
16 Programmer's Manual (for 19.13) Third Edition, July 1995 XEmacs Lisp
17 Reference Manual (for 19.14 and 20.0) v3.1, March 1996 XEmacs Lisp
18 Reference Manual (for 19.15 and 20.1, 20.2, 20.3) v3.2, April, May,
19 November 1997 XEmacs Lisp Reference Manual (for 21.0) v3.3, April 1998
21 Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995 Free Software
22 Foundation, Inc. Copyright (C) 1994, 1995 Sun Microsystems, Inc.
23 Copyright (C) 1995, 1996 Ben Wing.
25 Permission is granted to make and distribute verbatim copies of this
26 manual provided the copyright notice and this permission notice are
27 preserved on all copies.
29 Permission is granted to copy and distribute modified versions of
30 this manual under the conditions for verbatim copying, provided that the
31 entire resulting derived work is distributed under the terms of a
32 permission notice identical to this one.
34 Permission is granted to copy and distribute translations of this
35 manual into another language, under the above conditions for modified
36 versions, except that this permission notice may be stated in a
37 translation approved by the Foundation.
39 Permission is granted to copy and distribute modified versions of
40 this manual under the conditions for verbatim copying, provided also
41 that the section entitled "GNU General Public License" is included
42 exactly as in the original, and provided that the entire resulting
43 derived work is distributed under the terms of a permission notice
44 identical to this one.
46 Permission is granted to copy and distribute translations of this
47 manual into another language, under the above conditions for modified
48 versions, except that the section entitled "GNU General Public License"
49 may be included in a translation approved by the Free Software
50 Foundation instead of in the original English.
53 File: lispref.info, Node: The LDAP Lisp Object, Next: Opening and Closing a LDAP Connection, Prev: The Low-Level LDAP API, Up: The Low-Level LDAP API
58 An internal built-in `ldap' Lisp object represents an LDAP connection.
60 - Function: ldapp object
61 This function returns non-`nil' if OBJECT is a `ldap' object.
63 - Function: ldap-host ldap
64 Return the server host of the connection represented by LDAP.
66 - Function: ldap-live-p ldap
67 Return non-`nil' if LDAP is an active LDAP connection.
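The three functions above can be combined into a small diagnostic helper. This is only a sketch: `my-ldap-describe' is a hypothetical name and is not part of the XEmacs LDAP API; it relies solely on `ldapp', `ldap-live-p' and `ldap-host' as documented above.

```elisp
;; Hypothetical helper, not part of the XEmacs LDAP API.
(defun my-ldap-describe (obj)
  "Return a short string describing OBJ as an LDAP connection."
  (cond ((not (ldapp obj))
         "not an ldap object")
        ((ldap-live-p obj)
         ;; Only ask for the host of a connection known to be live.
         (format "live connection to %s" (ldap-host obj)))
        (t
         "inactive ldap connection")))
```

Checking `ldapp' first avoids passing an arbitrary Lisp object to the other two functions.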
70 File: lispref.info, Node: Opening and Closing a LDAP Connection, Next: Low-level Operations on a LDAP Server, Prev: The LDAP Lisp Object, Up: The Low-Level LDAP API
72 Opening and Closing a LDAP Connection
73 .....................................
75 - Function: ldap-open host &optional plist
76 Open a LDAP connection to HOST. PLIST is a property list
77 containing additional parameters for the connection. Valid keys
80 The TCP port to use for the connection if different from
81 `ldap-default-port' or the library builtin value
84 The authentication method to use. Possible values depend on
85 the LDAP library XEmacs was compiled with; they may include
86 `simple', `krbv41' and `krbv42'.
89 The distinguished name of the user to bind as. This may look
90 like `c=com, o=Acme, cn=Babs Jensen', see RFC 1779 for
94 The password to use for authentication.
97 The dereference policy is one of the symbols `never',
98 `always', `search' or `find' and defines how aliases are
101 Aliases are never dereferenced.
104 Aliases are always dereferenced.
107 Aliases are dereferenced when searching.
110 Aliases are dereferenced when locating the base object
112 The default is `never'.
115 The timeout limit for the connection in seconds.
118 The maximum number of matches to return for searches
119 performed on this connection.
121 - Function: ldap-close ldap
122 Close the connection represented by LDAP.
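Because every `ldap-open' should be paired with an `ldap-close', one possible idiom is to wrap the pair in `unwind-protect'. The following is a minimal sketch under that assumption; `my-ldap-call-with-connection' is a hypothetical name, and PLIST is passed through to `ldap-open' unchanged.

```elisp
;; Hypothetical helper, not part of the XEmacs LDAP API.
(defun my-ldap-call-with-connection (host function &optional plist)
  "Open an LDAP connection to HOST, call FUNCTION with it, then close it.
PLIST is handed to `ldap-open' as is.  The connection is closed even
if FUNCTION signals an error.  Return the value of FUNCTION."
  (let ((ldap (ldap-open host plist)))
    (unwind-protect
        (funcall function ldap)
      (ldap-close ldap))))
```

A call might look like `(my-ldap-call-with-connection "ldap.example.org" 'ldap-host)', where the host name is only a placeholder.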
125 File: lispref.info, Node: Low-level Operations on a LDAP Server, Prev: Opening and Closing a LDAP Connection, Up: The Low-Level LDAP API
127 Low-level Operations on a LDAP Server
128 .....................................
130 `ldap-search-basic' is the low-level primitive to perform a search on an
131 LDAP server. It works directly on an open LDAP connection and thus
132 requires a preliminary call to `ldap-open'. Multiple searches can be
133 made on the same connection; the session must then be closed with
136 - Function: ldap-search-basic ldap filter &optional base scope attrs
137 attrsonly withdn verbose
138 Perform a search on an open connection LDAP created with
139 `ldap-open'. FILTER is a filter string for the search (*note
140 Syntax of Search Filters::). BASE is the distinguished name at which
141 to start the search. SCOPE is one of the symbols `base',
142 `onelevel' or `subtree' indicating the scope of the search limited
143 to a base object, to a single level or to the whole subtree. The
144 default is `subtree'. ATTRS is a list of strings indicating which
145 attributes to retrieve for each matching entry. If `nil' all
146 available attributes are returned. If ATTRSONLY is non-`nil' then
147 only the attributes are retrieved, not their associated values.
148 If WITHDN is non-`nil' then each entry in the result is prepended
149 with its distinguished name DN. If VERBOSE is non-`nil' then
150 progress messages are echoed. The function returns a list of
151 matching entries. Each entry is itself an alist of
152 attribute/value pairs optionally preceded by the DN of the entry
153 according to the value of WITHDN.
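The argument order of `ldap-search-basic' is easy to get wrong, so here is a hedged sketch of one call with every positional argument labelled. `my-ldap-find-mail' is a hypothetical wrapper; the attribute names are examples only, and NAME is inserted into the filter without any escaping.

```elisp
;; Hypothetical wrapper, not part of the XEmacs LDAP API.
(defun my-ldap-find-mail (ldap name base)
  "Search below BASE on connection LDAP for entries whose cn is NAME.
Only the `mail' attribute is requested, and each entry is
prepended with its distinguished name."
  (ldap-search-basic ldap
                     (concat "(cn=" name ")") ; FILTER
                     base                     ; BASE
                     'subtree                 ; SCOPE
                     '("mail")                ; ATTRS
                     nil                      ; ATTRSONLY: fetch values too
                     t))                      ; WITHDN: prepend the DN
```

VERBOSE is omitted here, so no progress messages are requested.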
155 - Function: ldap-add ldap dn entry
156 Add ENTRY to the LDAP directory reached through LDAP, a connection
157 opened with `ldap-open'. DN is the distinguished name of the
158 entry to add. ENTRY is an entry specification, i.e., a list of
159 cons cells containing attribute/value string pairs.
161 - Function: ldap-modify ldap dn mods
162 Modify an entry in an LDAP directory. LDAP is an LDAP connection
163 object created with `ldap-open'. DN is the distinguished name of
164 the entry to modify. MODS is a list of modifications to apply. A
165 modification is a list of the form `(MOD-OP ATTR VALUE1 VALUE2
166 ...)'. MOD-OP and ATTR are mandatory; the VALUEs are optional,
167 depending on MOD-OP. MOD-OP is the type of modification, one of
168 the symbols `add', `delete' or `replace'. ATTR is the LDAP
169 attribute type to modify.
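The data shapes expected by `ldap-add' and `ldap-modify' can be sketched as follows. All names beginning with `my-' are hypothetical, and the attribute names and values are illustrative only; the sketch follows the descriptions above (an entry is a list of attribute/value conses, a modification is a list starting with MOD-OP and ATTR).

```elisp
;; Hypothetical example data; attribute names and values are placeholders.
(defvar my-ldap-new-entry
  '(("objectClass" . "person")
    ("cn" . "Babs Jensen")
    ("sn" . "Jensen"))
  "Entry specification for `ldap-add': a list of (ATTR . VALUE) pairs.")

(defvar my-ldap-mods
  '((replace "mail" "babs@example.com")
    (add "telephoneNumber" "+1 555 0100" "+1 555 0101")
    (delete "description"))
  "Modification list for `ldap-modify': elements are (MOD-OP ATTR VALUE...).")

;; Hypothetical helper, not part of the XEmacs LDAP API.
(defun my-ldap-add-then-modify (ldap dn)
  "Add `my-ldap-new-entry' under DN on LDAP, then apply `my-ldap-mods'."
  (ldap-add ldap dn my-ldap-new-entry)
  (ldap-modify ldap dn my-ldap-mods))
```

The `delete' modification carries no values, which the description above allows since values are optional depending on MOD-OP.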
171 - Function: ldap-delete ldap dn
172 Delete an entry from an LDAP directory. LDAP is an LDAP connection
173 object created with `ldap-open'. DN is the distinguished name of
177 File: lispref.info, Node: LDAP Internationalization, Prev: The Low-Level LDAP API, Up: XEmacs LDAP API
179 LDAP Internationalization
180 -------------------------
182 The XEmacs LDAP API provides basic internationalization features based
183 on the LDAP v3 specification (essentially RFC2252 on "LDAP v3 Attribute
184 Syntax Definitions"). Unfortunately since there is currently no free
185 LDAP v3 server software, this part has not received much testing and
186 should be considered experimental. The framework is in place though.
188 - Function: ldap-decode-attribute attr
189 Decode the attribute/value pair ATTR according to LDAP rules. The
190 attribute name is looked up in `ldap-attribute-syntaxes-alist' and
191 the corresponding decoder is then retrieved from
192 `ldap-attribute-syntax-decoders' and applied to the value(s).
196 * LDAP Internationalization Variables::
197 * Encoder/Decoder Functions::
200 File: lispref.info, Node: LDAP Internationalization Variables, Next: Encoder/Decoder Functions, Prev: LDAP Internationalization, Up: LDAP Internationalization
202 LDAP Internationalization Variables
203 ...................................
205 - Variable: ldap-ignore-attribute-codings
206 If non-`nil', no encoding/decoding will be performed LDAP
209 - Variable: ldap-coding-system
210 Coding system of LDAP string values. LDAP v3 specifies the coding
211 system of strings to be UTF-8. You need an XEmacs with Mule
214 - Variable: ldap-default-attribute-decoder
215 Decoder function to use for attributes whose syntax is unknown.
216 Such a function receives an encoded attribute value as a string
217 and should return the decoded value as a string.
219 - Variable: ldap-attribute-syntax-encoders
220 A vector of functions used to encode LDAP attribute values. The
221 sequence of functions corresponds to the sequence of LDAP
222 attribute syntax object identifiers of the form
223 1.3.6.1.4.1.1466.1115.121.1.* as defined in RFC2252 section 4.3.2.
224 As of this writing, only a few encoder functions are available.
226 - Variable: ldap-attribute-syntax-decoders
227 A vector of functions used to decode LDAP attribute values. The
228 sequence of functions corresponds to the sequence of LDAP
229 attribute syntax object identifiers of the form
230 1.3.6.1.4.1.1466.1115.121.1.* as defined in RFC2252 section 4.3.2.
231 As of this writing, only a few decoder functions are available.
233 - Variable: ldap-attribute-syntaxes-alist
234 A map of LDAP attribute names to their type object id minor number.
235 This table is built from RFC2252 Section 5 and RFC2256 Section 5.
238 File: lispref.info, Node: Encoder/Decoder Functions, Prev: LDAP Internationalization Variables, Up: LDAP Internationalization
240 Encoder/Decoder Functions
241 .........................
243 - Function: ldap-encode-boolean bool
244 A function that encodes an elisp boolean BOOL into a LDAP boolean
245 string representation.
247 - Function: ldap-decode-boolean str
248 A function that decodes a LDAP boolean string representation STR
249 into an elisp boolean.
251 - Function: ldap-decode-string str
252 Decode a string STR according to `ldap-coding-system'.
254 - Function: ldap-encode-string str
255 Encode a string STR according to `ldap-coding-system'.
257 - Function: ldap-decode-address str
258 Decode an address STR according to `ldap-coding-system',
259 replacing $ signs with newlines as specified by LDAP encoding
262 - Function: ldap-encode-address str
263 Encode an address STR according to `ldap-coding-system',
264 replacing newlines with $ signs as specified by LDAP encoding
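The `$'/newline substitution described for the address functions can be illustrated in isolation. This is a sketch of the character substitution only, under the assumption that it is a plain one-for-one replacement; it performs no coding-system conversion and is not the implementation of `ldap-decode-address' or `ldap-encode-address'. Both function names are hypothetical.

```elisp
;; Hypothetical helpers illustrating only the $ <-> newline substitution.
(defun my-ldap-dollars-to-newlines (str)
  "Return a copy of STR with every `$' replaced by a newline."
  (mapconcat (lambda (c)
               (if (eq c ?$) "\n" (char-to-string c)))
             str ""))

(defun my-ldap-newlines-to-dollars (str)
  "Return a copy of STR with every newline replaced by `$'."
  (mapconcat (lambda (c)
               (if (eq c ?\n) "$" (char-to-string c)))
             str ""))
```

For example, `(my-ldap-dollars-to-newlines "1 Main St$Springfield")' yields a two-line string.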
268 File: lispref.info, Node: Syntax of Search Filters, Prev: XEmacs LDAP API, Up: LDAP Support
270 Syntax of Search Filters
271 ========================
273 LDAP search functions use RFC1558 syntax to describe the search filter.
274 In that syntax simple filters have the form:
276 (<attr> <filtertype> <value>)
278 `<attr>' is an attribute name such as `cn' for Common Name, `o' for
281 `<value>' is the corresponding value. This is generally an exact
282 string but may also contain `*' characters as wildcards
284 `<filtertype>' is one of `=', `~=', `<=' or `>=', which respectively
285 denote equality, approximate equality, less-than-or-equal and greater-than-or-equal.
287 Thus `(cn=John Smith)' matches all records having a common name
290 A special case is the presence filter `(<attr>=*)', which matches
291 records containing a particular attribute. For instance `(mail=*)'
292 matches all records containing a `mail' attribute.
294 Simple filters can be connected together with the logical operators
295 `&', `|' and `!' which stand for the usual and, or and not operators.
297 `(&(objectClass=Person)(mail=*)(|(sn=Smith)(givenname=John)))'
298 matches records of class `Person' containing a `mail' attribute and
299 corresponding to people whose last name is `Smith' or whose first name is `John'.
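Since filters are plain strings with a regular prefix syntax, they can be assembled programmatically. The helpers below are a hypothetical sketch (none of these names belong to the XEmacs LDAP API) and do not escape special characters in values.

```elisp
;; Hypothetical filter-building helpers; values are not escaped.
(defun my-ldap-filter (attr op value)
  "Return the simple filter (ATTR OP VALUE) as a string."
  (concat "(" attr op value ")"))

(defun my-ldap-filter-and (&rest filters)
  "Combine FILTERS with the `&' operator."
  (concat "(&" (apply 'concat filters) ")"))

(defun my-ldap-filter-or (&rest filters)
  "Combine FILTERS with the `|' operator."
  (concat "(|" (apply 'concat filters) ")"))

(defun my-ldap-filter-not (filter)
  "Negate FILTER with the `!' operator."
  (concat "(!" filter ")"))
```

With these, the compound filter above can be rebuilt as `(my-ldap-filter-and (my-ldap-filter "objectClass" "=" "Person") (my-ldap-filter "mail" "=" "*") (my-ldap-filter-or (my-ldap-filter "sn" "=" "Smith") (my-ldap-filter "givenname" "=" "John")))'.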
303 File: lispref.info, Node: PostgreSQL Support, Next: Internationalization, Prev: LDAP Support, Up: Top
308 XEmacs can be linked with PostgreSQL libpq run-time support to provide
309 relational database access from Emacs Lisp code.
313 * Building XEmacs with PostgreSQL support::
314 * XEmacs PostgreSQL libpq API::
315 * XEmacs PostgreSQL libpq Examples::
318 File: lispref.info, Node: Building XEmacs with PostgreSQL support, Next: XEmacs PostgreSQL libpq API, Up: PostgreSQL Support
320 Building XEmacs with PostgreSQL support
321 =======================================
323 XEmacs PostgreSQL support requires linking to the PostgreSQL libpq
324 library. Describing how to build and install PostgreSQL is beyond the
325 scope of this document. See the PostgreSQL manual for details.
327 If you have installed XEmacs from one of the binary kits on
328 (<ftp://ftp.xemacs.org/>), or are using an XEmacs binary from a CD ROM,
329 you may have XEmacs PostgreSQL support by default. `M-x
330 describe-installation' will tell you if you do.
332 If you are building XEmacs from source, you need to install
333 PostgreSQL first. On some systems, PostgreSQL will come pre-installed
334 in /usr. In this case, it should be autodetected when you run
335 configure. If PostgreSQL is installed into its default location,
336 `/usr/local/pgsql', you must specify `--site-prefixes=/usr/local/pgsql'
337 when you run configure. If PostgreSQL is installed into another
338 location, use that instead of `/usr/local/pgsql' when specifying
341 As of XEmacs 21.2, PostgreSQL versions 6.5.3 and 7.0 are supported.
342 XEmacs Lisp support for V7.0 is somewhat more extensive than support for
343 V6.5. In particular, asynchronous queries are supported.
346 File: lispref.info, Node: XEmacs PostgreSQL libpq API, Next: XEmacs PostgreSQL libpq Examples, Prev: Building XEmacs with PostgreSQL support, Up: PostgreSQL Support
348 XEmacs PostgreSQL libpq API
349 ===========================
351 The XEmacs PostgreSQL API is intended to be a policy-free, low-level
352 binding to libpq. The intent is to provide all the basic functionality
353 and then let high level Lisp code decide its own policies.
355 This documentation assumes that the reader has knowledge of SQL, but
356 requires no prior knowledge of libpq.
358 There are many examples in this manual and some setup will be
359 required. In order to run most of the following examples, the
360 following code needs to be executed. In addition to the data in
361 this table, nearly all of the examples assume that the free
362 variable `P' refers to this database connection. The examples in the
363 original edition of this manual were run against Postgres 7.0beta1.
366 (setq P (pq-connectdb ""))
367 ;; id is the primary key, shikona is a Japanese word that
368 ;; means `the professional name of a Sumo wrestler', and
369 ;; rank is the Sumo rank name.
370 (pq-exec P (concat "CREATE TABLE xemacs_test"
371 " (id int, shikona text, rank text);"))
372 (pq-exec P "COPY xemacs_test FROM stdin;")
373 (pq-put-line P "1\tMusashimaru\tYokozuna\n")
374 (pq-put-line P "2\tDejima\tOozeki\n")
375 (pq-put-line P "3\tMusoyama\tSekiwake\n")
376 (pq-put-line P "4\tMiyabiyama\tSekiwake\n")
377 (pq-put-line P "5\tWakanoyama\tMaegashira\n")
378 (pq-put-line P "\\.\n")
384 * libpq Lisp Variables::
385 * libpq Lisp Symbols and DataTypes::
386 * Synchronous Interface Functions::
387 * Asynchronous Interface Functions::
388 * Large Object Support::
389 * Other libpq Functions::
390 * Unimplemented libpq Functions::
393 File: lispref.info, Node: libpq Lisp Variables, Next: libpq Lisp Symbols and DataTypes, Prev: XEmacs PostgreSQL libpq API, Up: XEmacs PostgreSQL libpq API
398 Various Unix environment variables are used by libpq to provide defaults
399 to the many different parameters. In the XEmacs Lisp API, these
400 environment variables are bound to Lisp variables to provide more
401 convenient access from Lisp code. These variables are passed to the
402 backend database server during the establishment of a database
403 connection and when the `pq-setenv' call is made.
406 Initialized from the `PGHOST' environment variable. The default
410 Initialized from the `PGUSER' environment variable. The default
413 - Variable: pg:options
414 Initialized from the `PGOPTIONS' environment variable. Default
415 additional server options.
418 Initialized from the `PGPORT' environment variable. The default
419 TCP port to connect to.
422 Initialized from the `PGTTY' environment variable. The default
425 Compatibility note: Debugging TTYs are turned off in the XEmacs
428 - Variable: pg:database
429 Initialized from the `PGDATABASE' environment variable. The
430 default database to connect to.
433 Initialized from the `PGREALM' environment variable. The default
436 - Variable: pg:client-encoding
437 Initialized from the `PGCLIENTENCODING' environment variable. The
438 default client encoding.
440 Compatibility note: This variable is not present in non-Mule
441 XEmacsen. This variable is not present in versions of libpq prior
442 to 7.0. In the current implementation, client encoding is
443 equivalent to the `file-name-coding-system' format.
445 - Variable: pg:authtype
446 Initialized from the `PGAUTHTYPE' environment variable. The
447 default authentication scheme used.
449 Compatibility note: This variable is unused in versions of libpq
450 after 6.5. It is not implemented at all in the XEmacs Lisp
454 Initialized from the `PGGEQO' environment variable. Genetic
457 - Variable: pg:cost-index
458 Initialized from the `PGCOSTINDEX' environment variable. Cost
461 - Variable: pg:cost-heap
462 Initialized from the `PGCOSTHEAP' environment variable. Cost heap
466 Initialized from the `PGTZ' environment variable. Default
469 - Variable: pg:date-style
470 Initialized from the `PGDATESTYLE' environment variable. Default
471 date style in returned date objects.
473 - Variable: pg-coding-system
474 This is a variable controlling which coding system is used to
475 encode non-ASCII strings sent to the database.
477 Compatibility Note: This variable is not present in InfoDock.
480 File: lispref.info, Node: libpq Lisp Symbols and DataTypes, Next: Synchronous Interface Functions, Prev: libpq Lisp Variables, Up: XEmacs PostgreSQL libpq API
482 libpq Lisp Symbols and Datatypes
483 --------------------------------
485 The following set of symbols is used to represent the intermediate
486 states involved in the asynchronous interface.
488 - Symbol: pgres::polling-failed
489 Undocumented. A fatal error has occurred during processing of an
490 asynchronous operation.
492 - Symbol: pgres::polling-reading
493 An intermediate status return during an asynchronous operation. It
494 indicates that one may use `select' before polling again.
496 - Symbol: pgres::polling-writing
497 An intermediate status return during an asynchronous operation. It
498 indicates that one may use `select' before polling again.
500 - Symbol: pgres::polling-ok
501 An asynchronous operation has successfully completed.
503 - Symbol: pgres::polling-active
504 An intermediate status return during an asynchronous operation.
505 One can call the poll function again immediately.
507 - Function: pq-pgconn conn field
508 CONN is a database connection object. FIELD is a symbol indicating
509 which field of PGconn to fetch. Possible values are shown in the
518 Database user's password
521 Hostname database server is running on
524 TCP port number used in the connection
529 Compatibility note: Debugging TTYs are not used in the
533 Additional server options
536 Connection status. Possible return values are shown in the
539 The normal, connected status.
542 The connection is not open and the PGconn object needs
543 to be deleted by `pq-finish'.
545 `pg::connection-started'
546 An asynchronous connection has been started, but is not
549 `pg::connection-made'
550 An asynchronous connect has been made, and there is data
553 `pg::connection-awaiting-response'
554 Awaiting data from the backend during an asynchronous
557 `pg::connection-auth-ok'
558 Received authentication, waiting for the backend to
561 `pg::connection-setenv'
562 Negotiating environment during an asynchronous
566 The last error message that was delivered to this connection.
569 The process ID of the backend database server.
571 The `PGresult' object is used by libpq to encapsulate the results of
572 queries. The printed representation takes on four forms. When the
573 PGresult object contains tuples from an SQL `SELECT' it will look like:
575 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
576 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
578 The number in brackets indicates how many rows of data are available.
579 When the PGresult object is the result of a command query that doesn't
580 return anything, it will look like:
582 (pq-exec P "CREATE TABLE a_new_table (i int);")
583 => #<PGresult PGRES_COMMAND_OK - CREATE>
585 When the query is a command-type query that can affect a
586 number of different rows but doesn't return any of them, it will look
590 (pq-exec P "INSERT INTO a_new_table VALUES (1);")
591 (pq-exec P "INSERT INTO a_new_table VALUES (2);")
592 (pq-exec P "INSERT INTO a_new_table VALUES (3);")
593 (setq R (pq-exec P "DELETE FROM a_new_table;"))
594 => #<PGresult PGRES_COMMAND_OK[3] - DELETE 3>
596 Lastly, when the underlying PGresult object has been deallocated
597 directly by `pq-clear' the printed representation will look like:
600 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
605 The following set of functions are accessors to various data in the
608 - Function: pq-result-status result
609 Return status of a query result. RESULT is a PGresult object.
610 The return value is one of the symbols in the following table.
612 A query contained no text. This is usually the result of a
613 recoverable error, or a minor programming error.
616 A query command that doesn't return anything was executed
617 properly by the backend.
620 A query command that returns tuples was executed properly by
624 Copy Out data transfer is in progress.
627 Copy In data transfer is in progress.
629 `pgres::bad-response'
630 An unexpected response was received from the backend.
632 `pgres::nonfatal-error'
633 Undocumented. This value is returned when the libpq function
634 `PQresultStatus' is called with a `NULL' pointer.
637 Undocumented. An error has occurred in processing the query
638 and the operation was not completed.
640 - Function: pq-res-status result
641 Return the query result status as a string, not a symbol. RESULT
642 is a PGresult object.
644 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
645 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
649 - Function: pq-result-error-message result
650 Return an error message generated by the query, if any. RESULT is
653 (setq R (pq-exec P "SELECT * FROM xemacs-test;"))
654 => <A fatal error is signaled in the echo area>
655 (pq-result-error-message R)
656 => "ERROR: parser: parse error at or near \"-\"
659 - Function: pq-ntuples result
660 Return the number of tuples in the query result. RESULT is a
663 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
664 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
668 - Function: pq-nfields result
669 Return the number of fields in each tuple of the query result.
670 RESULT is a PGresult object.
672 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
673 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
677 - Function: pq-binary-tuples result
678 Returns t if binary tuples are present in the results, nil
679 otherwise. RESULT is a PGresult object.
681 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
682 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
686 - Function: pq-fname result field-index
687 Returns the name of a specific field. RESULT is a PGresult object.
688 FIELD-INDEX is the number of the column to select from. The first
689 column is number zero.
692 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
693 (setq i (pq-nfields R))
694 (while (>= (decf i) 0)
695 (push (pq-fname R i) l))
697 => ("id" "shikona" "rank")
699 - Function: pq-fnumber result field-name
700 Return the field number corresponding to the given field name. -1
701 is returned on a bad field name. RESULT is a PGresult object.
702 FIELD-NAME is a string representing the field name to find.
703 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
704 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
707 (pq-fnumber R "Not a field")
710 - Function: pq-ftype result field-num
711 Return an integer code representing the data type of the specified
712 column. RESULT is a PGresult object. FIELD-NUM is the field
715 The return value of this function is the Object ID (Oid) in the
716 database of the type. Further queries need to be made to various
717 system tables in order to convert this value into something useful.
719 - Function: pq-fmod result field-num
720 Return the type modifier code associated with a field. Field
721 numbers start at zero. RESULT is a PGresult object. FIELD-NUM
722 selects which field to use.
724 - Function: pq-fsize result field-index
725 Return size of the given field. RESULT is a PGresult object.
726 FIELD-INDEX selects which field to use.
729 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
730 (setq i (pq-nfields R))
731 (while (>= (decf i) 0)
732 (push (list (pq-ftype R i) (pq-fsize R i)) l))
734 => ((23 23) (25 25) (25 25))
736 - Function: pq-get-value result tup-num field-num
737 Retrieve a return value. RESULT is a PGresult object. TUP-NUM
738 selects which tuple to fetch from. FIELD-NUM selects which field
741 Both tuples and fields are numbered from zero.
743 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
744 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
752 - Function: pq-get-length result tup-num field-num
753 Return the length of a specific value. RESULT is a PGresult
754 object. TUP-NUM selects which tuple to fetch from. FIELD-NUM
755 selects which field to fetch from.
757 (setq R (pq-exec P "SELECT * FROM xemacs_test;"))
758 => #<PGresult PGRES_TUPLES_OK[5] - SELECT>
759 (pq-get-length R 0 1)
761 (pq-get-length R 1 1)
763 (pq-get-length R 2 1)
766 - Function: pq-get-is-null result tup-num field-num
767 Return t if the specific value is the SQL `NULL'. RESULT is a
768 PGresult object. TUP-NUM selects which tuple to fetch from.
769 FIELD-NUM selects which field to fetch from.
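The accessors above are usually combined to walk a whole result. The following sketch converts a PGresult into a list of rows using only `pq-ntuples', `pq-nfields', `pq-get-is-null' and `pq-get-value'. The name `my-pq-result-rows' is hypothetical, and mapping SQL `NULL' to `nil' is a choice made for this example, not something the API prescribes.

```elisp
;; Hypothetical helper, not part of the XEmacs libpq API.
(defun my-pq-result-rows (result)
  "Return the tuples of RESULT as a list of lists of field values.
Tuples and fields are visited in increasing index order, both
starting from zero.  SQL NULL values are represented as nil."
  (let ((ntuples (pq-ntuples result))
        (nfields (pq-nfields result))
        (rows nil)
        (tup 0))
    (while (< tup ntuples)
      (let ((row nil)
            (field 0))
        (while (< field nfields)
          (setq row (cons (if (pq-get-is-null result tup field)
                              nil
                            (pq-get-value result tup field))
                          row))
          (setq field (1+ field)))
        ;; ROW was built back to front; restore field order.
        (setq rows (cons (nreverse row) rows)))
      (setq tup (1+ tup)))
    (nreverse rows)))
```

Assuming the connection `P' and the table from the setup code, a call would look like `(my-pq-result-rows (pq-exec P "SELECT * FROM xemacs_test;"))'.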
771 - Function: pq-cmd-status result
772 Return a summary string from the query. RESULT is a PGresult
774 (setq R (pq-exec P "INSERT INTO xemacs_test
775 VALUES (6, 'Wakanohana', 'Yokozuna');"))
776 => #<PGresult PGRES_COMMAND_OK[1] - INSERT 542086 1>
779 (setq R (pq-exec P "UPDATE xemacs_test SET rank='retired'
780 WHERE shikona='Wakanohana';"))
781 => #<PGresult PGRES_COMMAND_OK[1] - UPDATE 1>
785 Note that the first number returned from an insertion, like in the
786 example, is an object ID number and will almost certainly vary from
787 system to system since object ID numbers in Postgres must be unique
788 across all databases.
790 - Function: pq-cmd-tuples result
791 Return the number of tuples if the last command was an
792 INSERT/UPDATE/DELETE. If the last command was something else, the
793 empty string is returned. RESULT is a PGresult object.
795 (setq R (pq-exec P "INSERT INTO xemacs_test VALUES
796 (7, 'Takanohana', 'Yokozuna');"))
797 => #<PGresult PGRES_COMMAND_OK[1] - INSERT 38688 1>
800 (setq R (pq-exec P "SELECT * from xemacs_test;"))
801 => #<PGresult PGRES_TUPLES_OK[7] - SELECT>
804 (setq R (pq-exec P "DELETE FROM xemacs_test
805 WHERE shikona LIKE '%hana';"))
806 => #<PGresult PGRES_COMMAND_OK[2] - DELETE 2>
810 - Function: pq-oid-value result
811 Return the object id of the insertion if the last command was an
812 INSERT. 0 is returned if the last command was not an insertion.
813 RESULT is a PGresult object.
815 In the first example, the numbers you will see on your local
816 system will almost certainly be different, however the second
817 number from the right in the unprintable PGresult object and the
818 number returned by `pq-oid-value' should match.
819 (setq R (pq-exec P "INSERT INTO xemacs_test VALUES
820 (8, 'Terao', 'Maegashira');"))
821 => #<PGresult PGRES_COMMAND_OK[1] - INSERT 542089 1>
824 (setq R (pq-exec P "SELECT shikona FROM xemacs_test
825 WHERE rank='Maegashira';"))
826 => #<PGresult PGRES_TUPLES_OK[2] - SELECT>
830 - Function: pq-make-empty-pgresult conn status
831 Create an empty pgresult with the given status. CONN is a database
832 connection object. STATUS is a value that can be returned by
835 The caller is responsible for making sure the return value gets
839 File: lispref.info, Node: Synchronous Interface Functions, Next: Asynchronous Interface Functions, Prev: libpq Lisp Symbols and DataTypes, Up: XEmacs PostgreSQL libpq API
841 Synchronous Interface Functions
842 -------------------------------
844 - Function: pq-connectdb conninfo
845 Establish a (synchronous) database connection. CONNINFO is a string
846 of blank-separated options. Options are of the form "OPTION =
847 VALUE". If VALUE contains blanks, it must be single quoted.
848 Blanks around the equal sign are optional. Multiple option
849 assignments are blank separated.
850 (pq-connectdb "dbname=japanese port = 25432")
851 => #<PGconn localhost:25432 steve/japanese>
852 The printed representation of a database connection object has four
853 fields. The first field is the hostname where the database server
854 is running (in this case localhost), the second field is the port
855 number, the third field is the database user name, and the fourth
856 field is the name of the database.
858 Database connection objects that have been disconnected, and that
859 will generate an immediate error if they are used, look like:
861 Bad connections can be reestablished with `pq-reset', or deleted
862 entirely with `pq-finish'.
864 A database connection object that has been deleted looks like:
865 (let ((P1 (pq-connectdb "")))
870 Note that database connection objects are the most heavyweight
871 objects in XEmacs Lisp at this writing, usually representing as
872 much as several megabytes of virtual memory on the machine the
873 database server is running on. It is wisest to explicitly delete
874 them when you are finished with them, rather than letting garbage
875 collection do it. An example idiom is:
877 (let ((P (pq-connectdb "")))
880 (...)) ; access database here
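The advice to delete connections explicitly can be sketched with `unwind-protect', so that `pq-finish' runs even if the database code signals an error. `my-pq-call-with-connection' is a hypothetical name; only `pq-connectdb' and `pq-finish' come from this manual.

```elisp
;; Hypothetical helper, not part of the XEmacs libpq API.
(defun my-pq-call-with-connection (conninfo function)
  "Connect using CONNINFO, call FUNCTION with the connection, then delete it.
The connection object is passed to `pq-finish' whether or not
FUNCTION exits normally.  Return the value of FUNCTION."
  (let ((conn (pq-connectdb conninfo)))
    (unwind-protect
        (funcall function conn)
      (pq-finish conn))))
```

A call might look like `(my-pq-call-with-connection "" (lambda (conn) (pq-exec conn "SELECT * FROM xemacs_test;")))'. Note that a PGresult returned this way outlives the connection it came from.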
883 The following options are available in the options string:
885 Authentication type. Same as `PGAUTHTYPE'. This is no
889 Database user name. Same as `PGUSER'.
895 Database name. Same as `PGDATABASE'.
898 Symbolic hostname. Same as `PGHOST'.
901 Host address as four octets (e.g., 192.168.1.1).
904 TCP port to connect to. Same as `PGPORT'.
907 Debugging TTY. Same as `PGTTY'. This value is suppressed in
911 Extra backend database options. Same as `PGOPTIONS'.
912 A database connection object is returned regardless of whether a
913 connection was established or not.
915 - Function: pq-reset conn
916 Reestablish database connection. CONN is a database connection
919 This function reestablishes a database connection using the
920 original connection parameters. This is useful if something has
921 happened to the TCP link and it has become broken.
923 - Function: pq-exec conn query
924 Make a synchronous database query. CONN is a database connection
925 object. QUERY is a string containing an SQL query. A PGresult
926 object is returned, which in turn may be queried by its many
927 accessor functions to retrieve state out of it. If the query
928 string contains multiple SQL commands, only results from the final
929 command are returned.
931 (setq R (pq-exec P "SELECT * FROM xemacs_test;
932 DELETE FROM xemacs_test WHERE id=8;"))
933 => #<PGresult PGRES_COMMAND_OK[1] - DELETE 1>
935 - Function: pq-notifies conn
936 Return the latest async notification that has not yet been handled.
937 CONN is a database connection object. If there has been a
938 notification, then a list of two elements will be returned. The
939 first element contains the relation name being notified, the second
940 element contains the backend process ID number. nil is returned
941 if there aren't any notifications to process.
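The return value of `pq-notifies' is either `nil' or a two-element list, so callers typically destructure it. A minimal sketch, with `my-pq-describe-notification' as a hypothetical name:

```elisp
;; Hypothetical helper, not part of the XEmacs libpq API.
(defun my-pq-describe-notification (conn)
  "Return a string describing the latest unhandled notification on CONN."
  (let ((note (pq-notifies conn)))
    (if note
        (format "relation %s notified by backend %s"
                (car note)          ; relation name
                (car (cdr note)))   ; backend process ID
      "no pending notification")))
```

The `%s' directive is used for both elements so the sketch makes no assumption about their Lisp types.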
943 - Function: PQsetenv conn
944 Synchronous transfer of environment variables to a backend. CONN A
945 database connection object.
947 Environment variable transfer is done as a normal part of database
950 Compatibility note: This function was present but not documented
951 in versions of libpq prior to 7.0.
954 File: lispref.info, Node: Asynchronous Interface Functions, Next: Large Object Support, Prev: Synchronous Interface Functions, Up: XEmacs PostgreSQL libpq API
956 Asynchronous Interface Functions
957 --------------------------------
959 Command-by-command examples are too complex for the asynchronous
960 interface functions. See the examples section for complete calling
963 - Function: pq-connect-start conninfo
964 Begin establishing an asynchronous database connection. CONNINFO
965 A string containing the connection options. See the documentation
966 of `pq-connectdb' for a listing of all the available flags.
968 - Function: pq-connect-poll conn
969 An intermediate function to be called during an asynchronous
970 database connection. CONN A database connection object. The
971 result codes are documented in a previous section.
973 - Function: pq-is-busy conn
974 Returns t if `pq-get-result' would block waiting for input. CONN
975 A database connection object.
977 - Function: pq-consume-input conn
978 Consume any available input from the backend. CONN A database
981 `nil' is returned if anything bad happens.
983 - Function: pq-reset-start conn
984 Reset connection to the backend asynchronously. CONN A database
987 - Function: pq-reset-poll conn
988 Poll an asynchronous reset for completion. CONN A database
991 - Function: pq-request-cancel conn
992 Attempt to request cancellation of the current operation. CONN A
993 database connection object.
995 The return value is t if the cancel request was successfully
996 dispatched, nil if not (in which case conn->errorMessage is set).
997 Note: successful dispatch is no guarantee that there will be any
998 effect at the backend. The application must read the operation
1001 - Function: pq-send-query conn query
1002 Submit a query to Postgres and don't wait for the result. CONN A
1003 database connection object. Returns t if the query was successfully
1004 submitted, nil on error (in which case conn->errorMessage is set).
1006 - Function: pq-get-result conn
1007 Retrieve an asynchronous result from a query. CONN A database
1010 `nil' is returned when no more query work remains.
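Since `nil' marks the end of the pending work, collecting every result of
a submitted query is a "call until nil" loop. The sketch below keeps that
loop separate from the database so it can be tried without a backend;
`my-drain-results' is a hypothetical name, and the result source is passed
in as a function of no arguments.

```elisp
(defun my-drain-results (next-result)
  "Call NEXT-RESULT until it returns nil; return the non-nil values in order.
NEXT-RESULT is a function of no arguments.  Hypothetical helper."
  (let (results result)
    (while (setq result (funcall next-result))
      ;; Accumulate in reverse, then restore arrival order at the end.
      (setq results (cons result results)))
    (nreverse results)))
```

With a connection P on which `pq-send-query' has succeeded, this could be
called as `(my-drain-results (lambda () (pq-get-result P)))'. Note that
`pq-get-result' may block whenever `pq-is-busy' returns t, so a timer-driven
caller would check that first.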
1012 - Function: pq-set-nonblocking conn arg
1013 Sets the PGconn's database connection non-blocking if the arg is
1014 TRUE or makes it blocking if the arg is FALSE. This will not
1015 protect you from PQexec(); you'll only be safe when using the
1016 non-blocking API. CONN A database connection object.
1018 - Function: pq-is-nonblocking conn
1019 Return the blocking status of the database connection. CONN A
1020 database connection object.
1022 - Function: pq-flush conn
1023 Force the write buffer to be written (or at least try). CONN A
1024 database connection object.
1026 - Function: PQsetenvStart conn
1027 Start asynchronously passing environment variables to a backend.
1028 CONN A database connection object.
1030 Compatibility note: this function is only available with libpq-7.0.
1032 - Function: PQsetenvPoll conn
1033 Check an asynchronous environment variables transfer for
1034 completion. CONN A database connection object.
1036 Compatibility note: this function is only available with libpq-7.0.
1038 - Function: PQsetenvAbort conn
1039 Attempt to terminate an asynchronous environment variables
1040 transfer. CONN A database connection object.
1042 Compatibility note: this function is only available with libpq-7.0.
1045 File: lispref.info, Node: Large Object Support, Next: Other libpq Functions, Prev: Asynchronous Interface Functions, Up: XEmacs PostgreSQL libpq API
1047 Large Object Support
1048 --------------------
1050 - Function: pq-lo-import conn filename
1051 Import a file as a large object into the database. CONN a
1052 database connection object. FILENAME filename to import.
1054 On success, the object id is returned.
1056 - Function: pq-lo-export conn oid filename
1057 Copy a large object in the database into a file. CONN a database
1058 connection object. OID object id number of a large object.
1059 FILENAME filename to export to.
1062 File: lispref.info, Node: Other libpq Functions, Next: Unimplemented libpq Functions, Prev: Large Object Support, Up: XEmacs PostgreSQL libpq API
1064 Other libpq Functions
1065 ---------------------
1067 - Function: pq-finish conn
1068 Destroy a database connection object by calling free on it. CONN
1069 a database connection object.
1071 Calling this routine is optional, because the usual XEmacs
1072 garbage collection mechanism will call the underlying libpq
1073 routine whenever it is releasing stale `PGconn' objects. However,
1074 this routine is useful in `unwind-protect' clauses to make
1075 connections go away quickly when unrecoverable errors have
1078 After calling this routine, the printed representation of the
1079 XEmacs wrapper object will contain the string "DEAD".
1081 - Function: pq-client-encoding conn
1082 Return the client encoding as an integer code. CONN a database
1085 (pq-client-encoding P)
1088 Compatibility note: This function did not exist prior to libpq-7.0
1089 and does not exist in a non-Mule XEmacs.
1091 - Function: pq-set-client-encoding conn encoding
1092 Set client coding system. CONN a database connection object.
1093 ENCODING a string representing the desired coding system.
1095 (pq-set-client-encoding P "EUC_JP")
1098 The current idiom for ensuring proper coding system conversion is
1099 the following (illustrated for EUC Japanese encoding):
1100 (setq P (pq-connectdb "..."))
1101 (let ((file-name-coding-system 'euc-jp)
1102 (pg-coding-system 'euc-jp))
1103 (pq-set-client-encoding P "EUC_JP")
1106 Compatibility note: This function did not exist prior to libpq-7.0
1107 and does not exist in a non-Mule XEmacs.
1109 - Function: pq-env-2-encoding
1110 Return the integer code representing the coding system in
1115 Compatibility note: This function did not exist prior to libpq-7.0
1116 and does not exist in a non-Mule XEmacs.
1118 - Function: pq-clear res
1119 Destroy a query result object by calling free() on it. RES a
1122 Note: The memory allocation systems of libpq and XEmacs are
1123 different. The XEmacs representation of a query result object
1124 will have both the XEmacs version and the libpq version freed at
1125 the next garbage collection when the object is no longer being
1126 referenced. Calling this function does not release the XEmacs
1127 object, it is still subject to the usual rules for Lisp objects.
1128 The printed representation of the XEmacs object will contain the
1129 string "DEAD" after this routine is called indicating that it is no
1130 longer useful for anything.
1132 - Function: pq-conn-defaults
1133 Return a data structure that represents the connection defaults.
1134 The data is returned as a list of lists, where each sublist
1135 contains info regarding a single option.
1138 File: lispref.info, Node: Unimplemented libpq Functions, Prev: Other libpq Functions, Up: XEmacs PostgreSQL libpq API
1140 Unimplemented libpq Functions
1141 -----------------------------
1143 - Unimplemented Function: PGconn *PQsetdbLogin (char *pghost, char
1144 *pgport, char *pgoptions, char *pgtty, char *dbName, char
1146 Synchronous database connection. PGHOST is the hostname of the
1147 PostgreSQL backend to connect to. PGPORT is the TCP port number
1148 to use. PGOPTIONS specifies other backend options. PGTTY
1149 specifies the debugging tty to use. DBNAME specifies the database
1150 name to use. LOGIN specifies the database user name. PWD
1151 specifies the database user's password.
1153 This routine is deprecated as of libpq-7.0, and its functionality
1154 can be replaced by external Lisp code if needed.
1156 - Unimplemented Function: PGconn *PQsetdb (char *pghost, char *pgport,
1157 char *pgoptions, char *pgtty, char *dbName)
1158 Synchronous database connection. PGHOST is the hostname of the
1159 PostgreSQL backend to connect to. PGPORT is the TCP port number
1160 to use. PGOPTIONS specifies other backend options. PGTTY
1161 specifies the debugging tty to use. DBNAME specifies the database
1164 This routine was deprecated in libpq-6.5.
1166 - Unimplemented Function: int PQsocket (PGconn *conn)
1167 Return socket file descriptor to a backend database process. CONN
1168 database connection object.
1170 - Unimplemented Function: void PQprint (FILE *fout, PGresult *res,
1172 Print out the results of a query to a designated C stream. FOUT C
1173 stream to print to. RES the query result object to print. PS the
1174 print options structure.
1176 This routine is deprecated as of libpq-7.0 and cannot be sensibly
1177 exported to XEmacs Lisp.
1179 - Unimplemented Function: void PQdisplayTuples (PGresult *res, FILE
1180 *fp, int fillAlign, char *fieldSep, int printHeader, int
1182 RES query result object to print FP C stream to print to FILLALIGN
1183 pad the fields with spaces FIELDSEP field separator PRINTHEADER
1184 display headers? QUIET
1186 This routine was deprecated in libpq-6.5.
1188 - Unimplemented Function: void PQprintTuples (PGresult *res, FILE
1189 *fout, int printAttName, int terseOutput, int width)
1190 RES query result object to print. FOUT C stream to print to.
1191 PRINTATTNAME print attribute names. TERSEOUTPUT delimiter bars.
1192 WIDTH width of column; if 0, use variable width.
1194 This routine was deprecated in libpq-6.5.
1196 - Unimplemented Function: int PQmblen (char *s, int encoding)
1197 Determine length of a multibyte encoded char at `*s'. S encoded
1198 string. ENCODING type of encoding.
1200 Compatibility note: This function was introduced in libpq-7.0.
1202 - Unimplemented Function: void PQtrace (PGconn *conn, FILE *debug_port)
1203 Enable tracing on `debug_port'. CONN database connection object.
1204 DEBUG_PORT C output stream to use.
1206 - Unimplemented Function: void PQuntrace (PGconn *conn)
1207 Disable tracing. CONN database connection object.
1209 - Unimplemented Function: char *PQoidStatus (PGconn *conn)
1210 Return the object id as a string of the last tuple inserted. CONN
1211 database connection object.
1213 Compatibility note: This function is deprecated in libpq-7.0,
1214 however it is used internally by the XEmacs binding code when
1215 linked against versions prior to 7.0.
1217 - Unimplemented Function: PGresult *PQfn (PGconn *conn, int fnid, int
1218 *result_buf, int *result_len, int result_is_int, PQArgBlock
1220 "Fast path" interface -- not really recommended for application use
1221 CONN A database connection object. FNID RESULT_BUF RESULT_LEN
1222 RESULT_IS_INT ARGS NARGS
1224 The following set of very low level large object functions aren't
1225 appropriate to be exported to Lisp.
1227 - Unimplemented Function: int pq-lo-open (PGconn *conn, int lobjid,
1229 CONN a database connection object. LOBJID a large object ID.
1232 - Unimplemented Function: int pq-lo-close (PGconn *conn, int fd)
1233 CONN a database connection object. FD a large object file
1236 - Unimplemented Function: int pq-lo-read (PGconn *conn, int fd, char
1238 CONN a database connection object. FD a large object file
1239 descriptor. BUF buffer to read into. LEN size of buffer.
1241 - Unimplemented Function: int pq-lo-write (PGconn *conn, int fd, char
1243 CONN a database connection object. FD a large object file
1244 descriptor. BUF buffer to write from. LEN size of buffer.
1246 - Unimplemented Function: int pq-lo-lseek (PGconn *conn, int fd, int
1248 CONN a database connection object. FD a large object file
1249 descriptor. OFFSET WHENCE
1251 - Unimplemented Function: int pq-lo-creat (PGconn *conn, int mode)
1252 CONN a database connection object. MODE opening modes.
1254 - Unimplemented Function: int pq-lo-tell (PGconn *conn, int fd)
1255 CONN a database connection object. FD a large object file
1258 - Unimplemented Function: int pq-lo-unlink (PGconn *conn, int lobjid)
1259 CONN a database connection object. LOBJID a large object ID.
1262 File: lispref.info, Node: XEmacs PostgreSQL libpq Examples, Prev: XEmacs PostgreSQL libpq API, Up: PostgreSQL Support
1264 XEmacs PostgreSQL libpq Examples
1265 ================================
1267 This is an example of one method of establishing an asynchronous
1270 (defun database-poller (P)
1271 (message "%S before poll" (pq-pgconn P 'pq::status))
1273 (message "%S after poll" (pq-pgconn P 'pq::status))
1274 (if (eq (pq-pgconn P 'pq::status) 'pg::connection-ok)
1276 (add-timeout .1 'database-poller P)))
1279 (setq P (pq-connect-start ""))
1280 (add-timeout .1 'database-poller P))
1281 => pg::connection-started before poll
1282 => pg::connection-made after poll
1283 => pg::connection-made before poll
1284 => pg::connection-awaiting-response after poll
1285 => pg::connection-awaiting-response before poll
1286 => pg::connection-auth-ok after poll
1287 => pg::connection-auth-ok before poll
1288 => pg::connection-setenv after poll
1289 => pg::connection-setenv before poll
1290 => pg::connection-ok after poll
1293 => #<PGconn localhost:25432 steve/steve>
1295 Here is an example of one method of doing an asynchronous reset.
1297 (defun database-poller (P)
1299 (message "%S before poll" (pq-pgconn P 'pq::status))
1300 (setq PS (pq-reset-poll P))
1301 (message "%S after poll [%S]" (pq-pgconn P 'pq::status) PS)
1302 (if (eq (pq-pgconn P 'pq::status) 'pg::connection-ok)
1304 (add-timeout .1 'database-poller P))))
1308 (add-timeout .1 'database-poller P))
1309 => pg::connection-started before poll
1310 => pg::connection-made after poll [pgres::polling-writing]
1311 => pg::connection-made before poll
1312 => pg::connection-awaiting-response after poll [pgres::polling-reading]
1313 => pg::connection-awaiting-response before poll
1314 => pg::connection-setenv after poll [pgres::polling-reading]
1315 => pg::connection-setenv before poll
1316 => pg::connection-ok after poll [pgres::polling-ok]
1319 => #<PGconn localhost:25432 steve/steve>
1321 And finally, an asynchronous query.
1323 (defun database-poller (P)
1325 (pq-consume-input P)
1327 (add-timeout .1 'database-poller P)
1328 (setq R (pq-get-result P))
1331 (push R result-list)
1332 (add-timeout .1 'database-poller P))))))
1334 (when (pq-send-query P "SELECT * FROM xemacs_test;")
1335 (setq result-list nil)
1336 (add-timeout .1 'database-poller P))
1340 => (#<PGresult PGRES_TUPLES_OK - SELECT>)
1342 Here is an example showing how multiple SQL statements in a single
1343 query can have all their results collected.
1344 ;; Using the same `database-poller' function from the previous example
1345 (when (pq-send-query P "SELECT * FROM xemacs_test;
1346 SELECT * FROM pg_database;
1347 SELECT * FROM pg_user;")
1348 (setq result-list nil)
1349 (add-timeout .1 'database-poller P))
1353 => (#<PGresult PGRES_TUPLES_OK - SELECT> #<PGresult PGRES_TUPLES_OK - SELECT> #<PGresult PGRES_TUPLES_OK - SELECT>)
1355 Here is an example which illustrates collecting all data from a
1356 query, including the field names.
1358 (defun pg-util-query-results (R)
1359 "Retrieve results of last SQL query into a list structure."
1360 (let ((i (1- (pq-ntuples R)))
1363 (setq j (1- (pq-nfields R)))
1366 (push (pq-get-value R i j) l2)
1370 (setq j (1- (pq-nfields R)))
1373 (push (pq-fname R j) l2)
1377 => pg-util-query-results
1378 (setq R (pq-exec P "SELECT * FROM xemacs_test ORDER BY field2 DESC;"))
1379 => #<PGresult PGRES_TUPLES_OK - SELECT>
1380 (pg-util-query-results R)
1381 => (("f1" "field2") ("a" "97") ("b" "97") ("stuff" "42") ("a string" "12") ("foo" "10") ("string" "2") ("text" "1"))
1383 Here is an example of a query that uses a database cursor.
1386 (setq R (pq-exec P "BEGIN;"))
1387 (setq R (pq-exec P "DECLARE k_cursor CURSOR FOR SELECT * FROM xemacs_test ORDER BY f1 DESC;"))
1389 (setq R (pq-exec P "FETCH k_cursor;"))
1390 (while (eq (pq-ntuples R) 1)
1391 (push (list (pq-get-value R 0 0) (pq-get-value R 0 1)) data)
1392 (setq R (pq-exec P "FETCH k_cursor;")))
1393 (setq R (pq-exec P "END;"))
1395 => (("a" "97") ("a string" "12") ("b" "97") ("foo" "10") ("string" "2") ("stuff" "42") ("text" "1"))
1397 Here's another example of cursors, this time with a Lisp macro to
1398 implement a mapping function over a table.
1400 (defmacro map-db (P table condition callout)
1402 (pq-exec ,P "BEGIN;")
1403 (pq-exec ,P (concat "DECLARE k_cursor CURSOR FOR SELECT * FROM "
1407 " ORDER BY f1 DESC;"))
1408 (setq R (pq-exec ,P "FETCH k_cursor;"))
1409 (while (eq (pq-ntuples R) 1)
1410 (,callout (pq-get-value R 0 0) (pq-get-value R 0 1))
1411 (setq R (pq-exec ,P "FETCH k_cursor;")))
1412 (pq-exec ,P "END;")))
1414 (defun callback (arg1 arg2)
1415 (message "arg1 = %s, arg2 = %s" arg1 arg2))
1417 (map-db P "xemacs_test" "WHERE field2 > 10" callback)
1418 => arg1 = stuff, arg2 = 42
1419 => arg1 = b, arg2 = 97
1420 => arg1 = a string, arg2 = 12
1421 => arg1 = a, arg2 = 97
1422 => #<PGresult PGRES_COMMAND_OK - COMMIT>
1425 File: lispref.info, Node: Internationalization, Next: MULE, Prev: PostgreSQL Support, Up: Top
1427 Internationalization
1428 ********************
1432 * I18N Levels 1 and 2:: Support for different time, date, and currency formats.
1433 * I18N Level 3:: Support for localized messages.
1434 * I18N Level 4:: Support for Asian languages.
1437 File: lispref.info, Node: I18N Levels 1 and 2, Next: I18N Level 3, Up: Internationalization
1442 XEmacs is now compliant with I18N levels 1 and 2. Specifically, this
1443 means that it is 8-bit clean and correctly handles time and date
1444 functions. XEmacs will correctly display the entire ISO-Latin 1
1447 The compose key may now be used to create any character in the
1448 ISO-Latin 1 character set not directly available via the keyboard. In
1449 order for the compose key to work it is necessary to load the file
1450 `x-compose.el'. At any time while composing a character, `C-h' will
1451 display all valid completions and the character which would be produced.
1454 File: lispref.info, Node: I18N Level 3, Next: I18N Level 4, Prev: I18N Levels 1 and 2, Up: Internationalization
1462 * Level 3 Primitives::
1463 * Dynamic Messaging::
1464 * Domain Specification::
1465 * Documentation String Extraction::
1468 File: lispref.info, Node: Level 3 Basics, Next: Level 3 Primitives, Up: I18N Level 3
1473 XEmacs now provides alpha-level functionality for I18N Level 3. This
1474 means that everything necessary for full messaging is available, but
1475 not every file has been converted.
1477 The two message files which have been created are `src/emacs.po' and
1478 `lisp/packages/mh-e.po'. Both files need to be converted using
1479 `msgfmt', and the resulting `.mo' files placed in some locale's
1480 `LC_MESSAGES' directory. The test "translations" in these files are
1481 the original messages prefixed by `TRNSLT_'.
1483 The domain for a variable is stored on the variable's property list
1484 under the property name VARIABLE-DOMAIN. The function
1485 `documentation-property' uses this information when translating a
1486 variable's documentation.
1489 File: lispref.info, Node: Level 3 Primitives, Next: Dynamic Messaging, Prev: Level 3 Basics, Up: I18N Level 3
1494 - Function: gettext string
1495 This function looks up STRING in the default message domain and
1496 returns its translation. If `I18N3' was not enabled when XEmacs
1497 was compiled, it just returns STRING.
1499 - Function: dgettext domain string
1500 This function looks up STRING in the specified message domain and
1501 returns its translation. If `I18N3' was not enabled when XEmacs
1502 was compiled, it just returns STRING.
1504 - Function: bind-text-domain domain pathname
1505 This function associates a pathname with a message domain. Here's
1506 how the path to message file is constructed under SunOS 5.x:
1508 `{pathname}/{LANG}/LC_MESSAGES/{domain}.mo'
1510 If `I18N3' was not enabled when XEmacs was compiled, this function
1513 - Special Form: domain string
1514 This special form specifies the text domain used for translating
1515 documentation strings and interactive prompts of a function. For
1518 (defun foo (arg) "Doc string" (domain "emacs-foo") ...)
1520 to specify `emacs-foo' as the text domain of the function `foo'.
1521 The "call" to `domain' is actually a declaration rather than a
1522 function; when actually called, `domain' just returns `nil'.
1524 - Function: domain-of function
1525 This function returns the text domain of FUNCTION; it returns
1526 `nil' if it is the default domain. If `I18N3' was not enabled
1527 when XEmacs was compiled, it always returns `nil'.
1530 File: lispref.info, Node: Dynamic Messaging, Next: Domain Specification, Prev: Level 3 Primitives, Up: I18N Level 3
1535 The `format' function has been extended to permit you to change the
1536 order of parameter insertion. For example, the conversion format
1537 `%1$s' inserts parameter one as a string, while `%2$s' inserts
1538 parameter two. This is useful when creating translations which require
1539 you to change the word order.
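As an illustration of why this matters, the sketch below formats the same
two arguments with two templates whose word order differs. The function
name and the template strings are hypothetical and merely stand in for an
original message and its translation.

```elisp
(defun my-report-owner (owner item)
  "Return two messages naming OWNER and ITEM, one per word order.
Hypothetical helper; the templates stand in for translations."
  (list
   ;; Source-language order: owner first, then item.
   (format "%1$s owns %2$s" owner item)
   ;; A translation may need the arguments in the opposite order,
   ;; without any change to the call site.
   (format "%2$s belongs to %1$s" owner item)))
```

Because the argument order at the call site never changes, only the
template needs to be translated.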
1542 File: lispref.info, Node: Domain Specification, Next: Documentation String Extraction, Prev: Dynamic Messaging, Up: I18N Level 3
1544 Domain Specification
1545 --------------------
1547 The default message domain of XEmacs is `emacs'. For add-on packages,
1548 it is best to use a different domain. For example, let us say we want
1549 to convert the "gorilla" package to use the domain `emacs-gorilla'. To
1550 translate the message "What gorilla?", use `dgettext' as follows:
1552 (dgettext "emacs-gorilla" "What gorilla?")
1554 A function (or macro) which has a documentation string or an
1555 interactive prompt needs to be associated with the domain in order for
1556 the documentation or prompt to be translated. This is done with the
1557 `domain' special form as follows:
1559 (defun scratch (location)
1560 "Scratch the specified location."
1561 (domain "emacs-gorilla")
1562 (interactive "sScratch: ")
1565 It is most efficient to specify the domain in the first line of the
1566 function body, before the `interactive' form.
1568 For variables and constants which have documentation strings,
1569 specify the domain after the documentation.
1571 - Special Form: defvar symbol [value [doc-string [domain]]]
1573 (defvar weight 250 "Weight of gorilla, in pounds." "emacs-gorilla")
1575 - Special Form: defconst symbol [value [doc-string [domain]]]
1577 (defconst limbs 4 "Number of limbs" "emacs-gorilla")
1579 - Function: autoload function filename &optional docstring interactive
1581 This function defines FUNCTION to autoload from FILENAME. Example:
1582 (autoload 'explore "jungle" "Explore the jungle." nil nil "emacs-gorilla")
1585 File: lispref.info, Node: Documentation String Extraction, Prev: Domain Specification, Up: I18N Level 3
1587 Documentation String Extraction
1588 -------------------------------
1590 The utility `etc/make-po' scans the file `DOC' to extract documentation
1591 strings and creates a message file `doc.po'. This file may then be
1592 inserted within `emacs.po'.
1594 Currently, `make-po' is hard-coded to read from `DOC' and write to
1595 `doc.po'. In order to extract documentation strings from an add-on
1596 package, first run `make-docfile' on the package to produce the `DOC'
1597 file. Then run `make-po' with the `-p' argument to indicate that we
1598 are extracting documentation for an add-on package.
1600 (The `-p' argument is a kludge to make up for a subtle difference
1601 between pre-loaded documentation and add-on documentation: For add-on
1602 packages, the final carriage returns in the strings produced by
1603 `make-docfile' must be ignored.)
1606 File: lispref.info, Node: I18N Level 4, Prev: I18N Level 3, Up: Internationalization
1611 The Asian-language support in XEmacs is called "MULE". *Note MULE::.
1614 File: lispref.info, Node: MULE, Next: Tips, Prev: Internationalization, Up: Top
1619 "MULE" is the name originally given to the version of GNU Emacs
1620 extended for multi-lingual (and in particular Asian-language) support.
1621 "MULE" is short for "MUlti-Lingual Emacs". It is an extension and
1622 complete rewrite of Nemacs ("Nihon Emacs" where "Nihon" is the Japanese
1623 word for "Japan"), which only provided support for Japanese. XEmacs
1624 refers to its multi-lingual support as "MULE support" since it is based
1629 * Internationalization Terminology::
1630 Definition of various internationalization terms.
1631 * Charsets:: Sets of related characters.
1632 * MULE Characters:: Working with characters in XEmacs/MULE.
1633 * Composite Characters:: Making new characters by overstriking other ones.
1634 * Coding Systems:: Ways of representing a string of chars using integers.
1635 * CCL:: A special language for writing fast converters.
1636 * Category Tables:: Subdividing charsets into groups.
1639 File: lispref.info, Node: Internationalization Terminology, Next: Charsets, Up: MULE
1641 Internationalization Terminology
1642 ================================
1644 In internationalization terminology, a string of text is divided up
1645 into "characters", which are the printable units that make up the text.
1646 A single character is (for example) a capital `A', the number `2', a
1647 Katakana character, a Hangul character, a Kanji ideograph (an
1648 "ideograph" is a "picture" character, such as is used in Japanese
1649 Kanji, Chinese Hanzi, and Korean Hanja; typically there are thousands
1650 of such ideographs in each language), etc. The basic property of a
1651 character is that it is the smallest unit of text with semantic
1652 significance in text processing.
1654 Human beings normally process text visually, so to a first
1655 approximation a character may be identified with its shape. Note that
1656 the same character may be drawn by two different people (or in two
1657 different fonts) in slightly different ways, although the "basic shape"
1658 will be the same. But consider the works of Scott Kim; human beings
1659 can recognize hugely variant shapes as the "same" character.
1660 Sometimes, especially where characters are extremely complicated to
1661 write, completely different shapes may be defined as the "same"
1662 character in national standards. The Taiwanese variant of Hanzi is
1663 generally the most complicated; over the centuries, the Japanese,
1664 Koreans, and the People's Republic of China have adopted
1665 simplifications of the shape, but the line of descent from the original
1666 shape is recorded, and the meanings and pronunciation of different
1667 forms of the same character are considered to be identical within each
1668 language. (Of course, it may take a specialist to recognize the
1669 related form; the point is that the relations are standardized, despite
1670 the differing shapes.)
1672 In some cases, the differences will be significant enough that it is
1673 actually possible to identify two or more distinct shapes that both
1674 represent the same character. For example, the lowercase letters `a'
1675 and `g' each have two distinct possible shapes--the `a' can optionally
1676 have a curved tail projecting off the top, and the `g' can be formed
1677 either of two loops, or of one loop and a tail hanging off the bottom.
1678 Such distinct possible shapes of a character are called "glyphs". The
1679 important characteristic of two glyphs making up the same character is
1680 that the choice between one or the other is purely stylistic and has no
1681 linguistic effect on a word (this is the reason why a capital `A' and
1682 lowercase `a' are different characters rather than different
1683 glyphs--e.g. `Aspen' is a city while `aspen' is a kind of tree).
1685 Note that "character" and "glyph" are used differently here than
1686 elsewhere in XEmacs.
1688 A "character set" is essentially a set of related characters. ASCII,
1689 for example, is a set of 94 characters (or 128, if you count
1690 non-printing characters). Other character sets are ISO8859-1 (ASCII
1691 plus various accented characters and other international symbols), JIS
1692 X 0201 (ASCII, more or less, plus half-width Katakana), JIS X 0208
1693 (Japanese Kanji), JIS X 0212 (a second set of less-used Japanese Kanji),
1694 GB2312 (Mainland Chinese Hanzi), etc.
1696 The definition of a character set will implicitly or explicitly give
1697 it an "ordering", a way of assigning a number to each character in the
1698 set. For many character sets, there is a natural ordering, for example
1699 the "ABC" ordering of the Roman letters. But it is not clear whether
1700 digits should come before or after the letters, and in fact different
1701 European languages treat the ordering of accented characters
1702 differently. It is useful to use the natural order where available, of
1703 course. The number assigned to any particular character is called the
1704 character's "code point". (Within a given character set, each
1705 character has a unique code point. Thus the word "set" is ill-chosen;
1706 different orderings of the same characters are different character sets.
1707 Identifying characters is simple enough for alphabetic character sets,
1708 but the difference in ordering can cause great headaches when the same
1709 thousands of characters are used by different cultures as in the Hanzi.)
1711 A code point may be broken into a number of "position codes". The
1712 number of position codes required to index a particular character in a
1713 character set is called the "dimension" of the character set. For
1714 practical purposes, a position code may be thought of as a byte-sized
1715 index. The printing characters of ASCII, being a relatively small
1716 character set, is of dimension one, and each character in the set is
1717 indexed using a single position code, in the range 1 through 94. Use of
1718 this unusual range, rather than the familiar 33 through 126, is an
1719 intentional abstraction; to understand the programming issues you must
1720 break the equation between character sets and encodings.
1722 JIS X 0208, i.e. Japanese Kanji, has thousands of characters, and is
1723 of dimension two - every character is indexed by two position codes,
1724 each in the range 1 through 94. (This number "94" is not a
1725 coincidence; we shall see that the JIS position codes were chosen so
1726 that JIS kanji could be encoded without using codes that in ASCII are
1727 associated with device control functions.) Note that the choice of the
1728 range here is somewhat arbitrary. You could just as easily index the
1729 printing characters in ASCII using numbers in the range 0 through 93, 2
1730 through 95, 3 through 96, etc. In fact, the standardized _encoding_
1731 for the ASCII _character set_ uses the range 33 through 126.
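The distinction between position codes and encoded values can be made
concrete with a little arithmetic. This is a sketch only: the function
names are hypothetical, and the linear index for a dimension-two set is
just one possible numbering, chosen here for illustration.

```elisp
(defun my-ascii-position-to-code (position)
  "Convert an ASCII printing-character POSITION (1..94) to its code (33..126)."
  (+ position 32))

(defun my-dim2-linear-index (pos1 pos2)
  "Map a dimension-two pair of position codes (each 1..94) to 0..8835.
One possible linearization, chosen only for illustration."
  (+ (* (1- pos1) 94) (1- pos2)))
```

The first function shows that the familiar range 33 through 126 is the
abstract range 1 through 94 shifted by a constant; the second shows that
two position codes together address 94 x 94 = 8836 slots.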
1733 An "encoding" is a way of numerically representing characters from
1734 one or more character sets into a stream of like-sized numerical values
1735 called "words"; typically these are 8-bit, 16-bit, or 32-bit
1736 quantities. If an encoding encompasses only one character set, then the
1737 position codes for the characters in that character set could be used
1738 directly. (This is the case with the trivial cipher used by children,
1739 assigning 1 to `A', 2 to `B', and so on.) However, even with ASCII,
1740 other considerations intrude. For example, why are the upper- and
1741 lowercase alphabets separated by 6 characters? Why do the digits start
1742 with `0' being assigned the code 48? In both cases because semantically
1743 interesting operations (case conversion and numerical value extraction)
1744 become convenient masking operations. Other artificial aspects (the
1745 control characters being assigned to codes 0-31 and 127) are historical
1746 accidents. (The use of 127 for `DEL' is an artifact of the "punch
1747 once" nature of paper tape, for example.)
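The masking operations mentioned above can be sketched in a few lines.
The function names are hypothetical, and neither function validates its
argument; each assumes it is handed a lowercase ASCII letter or an ASCII
digit respectively.

```elisp
(defun my-ascii-upcase (code)
  "Return the uppercase code for the lowercase ASCII letter CODE.
Clears the bit with value 32, the only bit in which `a' and `A' differ."
  (logand code (lognot 32)))

(defun my-ascii-digit-value (code)
  "Return the numeric value of the ASCII digit CODE.
Keeps only the low four bits, which hold the value because `0' is 48."
  (logand code 15))
```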
1749 Naive use of the position code is not possible, however, if more than
1750 one character set is to be used in the encoding. For example, printed
1751 Japanese text typically requires characters from multiple character sets
1752 - ASCII, JIS X 0208, and JIS X 0212, to be specific. Each of these is
1753 indexed using one or more position codes in the range 1 through 94, so
1754 the position codes could not be used directly or there would be no way
1755 to tell which character was meant. Different Japanese encodings handle
1756 this differently - JIS uses special escape characters to denote
1757 different character sets; EUC sets the high bit of the position codes
1758 for JIS X 0208 and JIS X 0212, and puts a special extra byte before each
1759 JIS X 0212 character; etc. (JIS, EUC, and most of the other encodings
1760 you will encounter in files are 7-bit or 8-bit encodings. There is one
1761 common 16-bit encoding, which is Unicode; this strives to represent all
1762 the world's characters in a single large character set. 32-bit
1763 encodings are often used internally in programs, such as XEmacs with
1764 MULE support, to simplify the code that manipulates them; however, they
1765 are not used externally because they are not very space-efficient.)
1767 A general method of handling text using multiple character sets
1768 (whether for multilingual text, or simply text in an extremely
1769 complicated single language like Japanese) is defined in the
1770 international standard ISO 2022. ISO 2022 will be discussed in more
1771 detail later (*note ISO 2022::), but for now suffice it to say that text
1772 needs control functions (at least spacing), and if escape sequences are
1773 to be used, an escape sequence introducer. It was decided to make all
1774 text streams compatible with ASCII in the sense that the codes 0-31
1775 (and 128-159) would always be control codes, never graphic characters,
1776 and where defined by the character set the `SPC' character would be
1777 assigned code 32, and `DEL' would be assigned 127. Thus there are 94
1778 code points remaining if 7 bits are used. This is the reason that most
1779 character sets are defined using position codes in the range 1 through
1780 94. Then ISO 2022 compatible encodings are produced by shifting the
1781 position codes 1 to 94 into character codes 33 to 126, or (if 8 bit
1782 codes are available) into character codes 161 to 254.
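
A minimal Python sketch of the shift just described, using hypothetical function names: a position code in the range 1 through 94 becomes a 7-bit character code by adding 0x20, and the 8-bit form is the same value with the high bit set.

```python
def _check(position):
    if not 1 <= position <= 94:
        raise ValueError("position code must be in the range 1..94")

def to_7bit(position):
    """Position code 1..94 -> character code 33..126."""
    _check(position)
    return position + 0x20

def to_8bit(position):
    """Position code 1..94 -> character code 161..254 (high bit set)."""
    return to_7bit(position) | 0x80

print(to_7bit(1), to_7bit(94), to_8bit(1), to_8bit(94))
```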
1784 Encodings are classified as either "modal" or "non-modal". In a
1785 "modal encoding", there are multiple states that the encoding can be
1786 in, and the interpretation of the values in the stream depends on the
1787 current global state of the encoding. Special values in the encoding,
1788 called "escape sequences", are used to change the global state. JIS,
1789 for example, is a modal encoding. The bytes `ESC $ B' indicate that,
1790 from then on, bytes are to be interpreted as position codes for JIS X
1791 0208, rather than as ASCII. This effect is cancelled using the bytes
1792 `ESC ( B', which mean "switch from whatever the current state is to
1793 ASCII". To switch to JIS X 0212, the escape sequence is `ESC $ ( D'.
1794 (Note that here, as is common, the escape sequences do in fact begin
1795 with `ESC'. This is not necessarily the case, however. Some encodings
1796 use control characters called "locking shifts" (effect persists until
1797 cancelled) to switch character sets.)
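
The way a modal decoder carries state from one byte to the next can be sketched as below. This is an illustrative Python fragment, not the XEmacs decoder: it recognizes only the three escape sequences named in the text, assumes the stream starts in the ASCII state, and reports two-byte characters as position codes.

```python
ESC = 0x1B

# Escape sequence -> (state name, bytes per character)
DESIGNATIONS = {
    b"\x1b(B": ("ascii", 1),
    b"\x1b$B": ("jisx0208", 2),
    b"\x1b$(D": ("jisx0212", 2),
}

def decode_modal(data):
    """Split a JIS-style byte string into (charset, value) pairs."""
    state, width = "ascii", 1   # assumed initial state
    out, i = [], 0
    while i < len(data):
        if data[i] == ESC:
            for seq, (name, w) in DESIGNATIONS.items():
                if data.startswith(seq, i):
                    state, width = name, w
                    i += len(seq)
                    break
            else:
                raise ValueError("unknown escape sequence at offset %d" % i)
            continue
        chunk = data[i:i + width]
        if len(chunk) < width:
            raise ValueError("truncated character at offset %d" % i)
        if state == "ascii":
            out.append((state, chr(chunk[0])))
        else:
            # Bytes 33..126 carry position codes 1..94.
            out.append((state, tuple(b - 0x20 for b in chunk)))
        i += width
    return out

print(decode_modal(b'A\x1b$B$"\x1b(BB'))
```

The same two bytes `$"` would be read as two ASCII characters if the preceding `ESC $ B` were missing, which is exactly what "modal" means here.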
1799 A "non-modal encoding" has no global state that extends past the
1800 character currently being interpreted. EUC, for example, is a
1801 non-modal encoding. Characters in JIS X 0208 are encoded by setting
1802 the high bit of the position codes, and characters in JIS X 0212 are
1803 encoded by doing the same but also prefixing the character with the byte 0x8F.
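
As a rough Python sketch of the non-modal scheme (the function name and the tuple representation are hypothetical), each character is encoded without reference to any earlier byte. The prefix byte that EUC-JP uses for JIS X 0212 is 0x8F.

```python
SS3 = 0x8F  # prefix byte that EUC-JP places before a JIS X 0212 character

def euc_encode(charset, codes):
    """Encode one character, given as a tuple of codes, as EUC-JP bytes.

    For "ascii", CODES holds the single ASCII code; for the
    two-dimensional sets it holds two position codes in 1..94.
    """
    if charset == "ascii":
        (code,) = codes
        return bytes([code])
    # Shift each position code to 33..126, then set the high bit.
    high = bytes((p + 0x20) | 0x80 for p in codes)
    if charset == "jisx0208":
        return high
    if charset == "jisx0212":
        return bytes([SS3]) + high
    raise ValueError("unsupported charset: %r" % (charset,))

print(euc_encode("jisx0208", (4, 2)).hex())
print(euc_encode("jisx0212", (16, 1)).hex())
```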
1806 The advantage of a modal encoding is that it is generally more
1807 space-efficient, and is easily extendible because there are essentially
1808 an arbitrary number of escape sequences that can be created. The
1809 disadvantage, however, is that it is much more difficult to work with
1810 if it is not being processed in a sequential manner. In the non-modal
1811 EUC encoding, for example, the byte 0x41 always refers to the letter
1812 `A'; whereas in JIS, it could either be the letter `A', or one of the
1813 two position codes in a JIS X 0208 character, or one of the two
1814 position codes in a JIS X 0212 character. Determining exactly which
1815 one is meant could be difficult and time-consuming if the previous
1816 bytes in the string have not already been processed, or impossible if
1817 they are drawn from an external stream that cannot be rewound.
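
The cost of random access into a modal stream can be seen in a small Python sketch (hypothetical helper, handling only the escape sequences mentioned earlier). To learn what a byte means, the scan has to start at the beginning of the data.

```python
STATES = {
    b"\x1b(B": "ascii",
    b"\x1b$B": "jisx0208",
    b"\x1b$(D": "jisx0212",
}

def state_at(data, index):
    """Return the charset state governing data[index].

    INDEX is assumed to point at a character byte, not into the
    middle of an escape sequence.  The scan is linear from offset 0.
    """
    state, i = "ascii", 0   # assumed initial state
    while i < index:
        if data[i] == 0x1B:
            for seq, name in STATES.items():
                if data.startswith(seq, i):
                    state = name
                    i += len(seq)
                    break
            else:
                raise ValueError("unknown escape sequence at offset %d" % i)
        else:
            i += 1
    return state

data = b"A\x1b$B0A\x1b(BA"
# The byte 0x41 occurs at offsets 0, 5 and 9 with different meanings.
print([(i, state_at(data, i)) for i in (0, 5, 9)])
```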
1819 Non-modal encodings are further divided into "fixed-width" and
1820 "variable-width" formats. A fixed-width encoding always uses the same
1821 number of words per character, whereas a variable-width encoding does
1822 not. EUC is a good example of a variable-width encoding: one to three
1823 bytes are used per character, depending on the character set. 16-bit
1824 and 32-bit encodings are nearly always fixed-width, and this is in fact
1825 one of the main reasons for using an encoding with a larger word size.
1826 The advantages of fixed-width encodings should be obvious. The
1827 advantages of variable-width encodings are that they are generally more
1828 space-efficient and allow for compatibility with existing 8-bit
1829 encodings such as ASCII. (For example, in Unicode ASCII characters are
1830 simply promoted to a 16-bit representation. That means that every
1831 ASCII character contains a `NUL' byte; evidently all of the standard
1832 string manipulation functions will lose badly in a fixed-width Unicode environment.)
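
The difference between the two formats can be observed with Python's standard codecs. This sketch assumes that big-endian UTF-16 without a byte-order mark is an acceptable stand-in for the 16-bit Unicode encoding described above; the helper name is hypothetical.

```python
def widths(text, encoding):
    """Return the number of bytes each character of TEXT occupies."""
    return [len(ch.encode(encoding)) for ch in text]

SAMPLE = "A\u3042"   # LATIN CAPITAL LETTER A, HIRAGANA LETTER A

print("euc_jp   ", widths(SAMPLE, "euc_jp"))
print("utf-16-be", widths(SAMPLE, "utf-16-be"))

# Every ASCII character acquires a NUL byte in the 16-bit form.
print("A".encode("utf-16-be"))
```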
1835 The bytes in an 8-bit encoding are often referred to as "octets"
1836 rather than simply as bytes. This terminology dates back to the days
1837 before 8-bit bytes were universal, when some computers had 9-bit bytes,
1838 others had 10-bit bytes, etc.
1841 File: lispref.info, Node: Charsets, Next: MULE Characters, Prev: Internationalization Terminology, Up: MULE
1846 A "charset" in MULE is an object that encapsulates a particular
1847 character set as well as an ordering of those characters. Charsets are
1848 permanent objects and are named using symbols, like faces.
1850 - Function: charsetp object
1851 This function returns non-`nil' if OBJECT is a charset.
1855 * Charset Properties:: Properties of a charset.
1856 * Basic Charset Functions:: Functions for working with charsets.
1857 * Charset Property Functions:: Functions for accessing charset properties.
1858 * Predefined Charsets:: Predefined charset objects.
1861 File: lispref.info, Node: Charset Properties, Next: Basic Charset Functions, Up: Charsets
1866 Charsets have the following properties:
1868 `name'
1869 A symbol naming the charset. Every charset must have a different
1870 name; this allows a charset to be referred to using its name
1871 rather than the actual charset object.
1873 `doc-string'
1874 A documentation string describing the charset.
1876 `registry'
1877 A regular expression matching the font registry field for this
1878 character set. For example, both the `ascii' and `latin-iso8859-1'
1879 charsets use the registry `"ISO8859-1"'. This field is used to
1880 choose an appropriate font when the user gives a general font
1881 specification such as `-*-courier-medium-r-*-140-*', i.e. a
1882 14-point upright medium-weight Courier font.
1884 `dimension'
1885 Number of position codes used to index a character in the
1886 character set. XEmacs/MULE can only handle character sets of
1887 dimension 1 or 2. This property defaults to 1.
1889 `chars'
1890 Number of characters in each dimension. In XEmacs/MULE, the only
1891 allowed values are 94 or 96. (There are a couple of pre-defined
1892 character sets, such as ASCII, that do not follow this, but you
1893 cannot define new ones like this.) Defaults to 94. Note that if
1894 the dimension is 2, the character set thus described is 94x94 or 96x96.
1897 `columns'
1898 Number of columns used to display a character in this charset.
1899 Only used in TTY mode. (Under X, the actual width of a character
1900 can be derived from the font used to display the characters.) If
1901 unspecified, defaults to the dimension. (This is almost always the
1902 correct value, because character sets with dimension 2 are usually
1903 ideograph character sets, which need two columns to display the
1904 intricate ideographs.)
1906 `direction'
1907 A symbol, either `l2r' (left-to-right) or `r2l' (right-to-left).
1908 Defaults to `l2r'. This specifies the direction that the text
1909 should be displayed in, and will be left-to-right for most
1910 charsets but right-to-left for Hebrew and Arabic. (Right-to-left
1911 display is not currently implemented.)
1913 `final'
1914 Final byte of the standard ISO 2022 escape sequence designating
1915 this charset. Must be supplied. Each combination of (DIMENSION,
1916 CHARS) defines a separate namespace for final bytes, and each
1917 charset within a particular namespace must have a different final
1918 byte. Note that ISO 2022 restricts the final byte to the range
1919 0x30 - 0x7E if dimension == 1, and 0x30 - 0x5F if dimension == 2.
1920 Note also that final bytes in the range 0x30 - 0x3F are reserved
1921 for user-defined (not official) character sets. For more
1922 information on ISO 2022, see *Note Coding Systems::.
1924 `graphic'
1925 0 (use left half of font on output) or 1 (use right half of font on
1926 output). Defaults to 0. This specifies how to convert the
1927 position codes that index a character in a character set into an
1928 index into the font used to display the character set. With
1929 `graphic' set to 0, position codes 33 through 126 map to font
1930 indices 33 through 126; with it set to 1, position codes 33
1931 through 126 map to font indices 161 through 254 (i.e. the same
1932 number but with the high bit set). For example, for a font whose
1933 registry is ISO8859-1, the left half of the font (octets 0x20 -
1934 0x7F) is the `ascii' charset, while the right half (octets 0xA0 -
1935 0xFF) is the `latin-iso8859-1' charset.
1937 `ccl-program'
1938 A compiled CCL program used to convert a character in this charset
1939 into an index into the font. This is in addition to the `graphic'
1940 property. If a CCL program is defined, the position codes of a
1941 character will first be processed according to `graphic' and then
1942 passed through the CCL program, with the resulting values used to index the font.
1945 This is used, for example, in the Big5 character set (used in
1946 Taiwan). This character set is not ISO-2022-compliant, and its
1947 size (94x157) does not fit within the maximum 96x96 size of
1948 ISO-2022-compliant character sets. As a result, XEmacs/MULE
1949 splits it (in a rather complex fashion, so as to group the most
1950 commonly used characters together) into two charset objects
1951 (`chinese-big5-1' and `chinese-big5-2'), each of size 94x94, and each charset
1952 object uses a CCL program to convert the modified position codes
1953 back into standard Big5 indices to retrieve a character from a
1956 Most of the above properties can only be set when the charset is
1957 initialized, and cannot be changed later. *Note Charset Property Functions::.
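
Two of the numeric rules in the property list above, the permitted range of the final byte and the effect of `graphic' on the font index, can be sketched in Python. The function names are hypothetical and the checks merely restate the ranges given in the text; they are not the validation code XEmacs itself uses.

```python
def final_byte_ok(final, dimension):
    """Check FINAL against the range given for charsets of DIMENSION."""
    if dimension not in (1, 2):
        raise ValueError("dimension must be 1 or 2")
    upper = 0x7E if dimension == 1 else 0x5F
    return 0x30 <= ord(final) <= upper

def is_private_final(final):
    """True if FINAL lies in the range reserved for user-defined sets."""
    return 0x30 <= ord(final) <= 0x3F

def font_index(code, graphic):
    """Map a code in 33..126 to a font index according to `graphic'."""
    if graphic not in (0, 1):
        raise ValueError("graphic must be 0 or 1")
    # graphic 0: left half, unchanged; graphic 1: right half, high bit set.
    return code | 0x80 if graphic else code

print(final_byte_ok("B", 2), is_private_final("0"))
print(font_index(33, 0), font_index(33, 1), font_index(126, 1))
```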
1961 File: lispref.info, Node: Basic Charset Functions, Next: Charset Property Functions, Prev: Charset Properties, Up: Charsets
1963 Basic Charset Functions
1964 -----------------------
1966 - Function: find-charset charset-or-name
1967 This function retrieves the charset of the given name. If
1968 CHARSET-OR-NAME is a charset object, it is simply returned.
1969 Otherwise, CHARSET-OR-NAME should be a symbol. If there is no
1970 such charset, `nil' is returned. Otherwise the associated charset object is returned.
1973 - Function: get-charset name
1974 This function retrieves the charset of the given name. Same as
1975 `find-charset' except an error is signalled if there is no such
1976 charset instead of returning `nil'.
1978 - Function: charset-list
1979 This function returns a list of the names of all defined charsets.
1981 - Function: make-charset name doc-string props
1982 This function defines a new character set. This function is for
1983 use with MULE support. NAME is a symbol, the name by which the
1984 character set is normally referred. DOC-STRING is a string
1985 describing the character set. PROPS is a property list,
1986 describing the specific nature of the character set. The
1987 recognized properties are `registry', `dimension', `columns',
1988 `chars', `final', `graphic', `direction', and `ccl-program', as
1989 previously described.
1991 - Function: make-reverse-direction-charset charset new-name
1992 This function makes a charset equivalent to CHARSET but which goes
1993 in the opposite direction. NEW-NAME is the name of the new
1994 charset. The new charset is returned.
1996 - Function: charset-from-attributes dimension chars final &optional direction
1998 This function returns a charset with the given DIMENSION, CHARS,
1999 FINAL, and DIRECTION. If DIRECTION is omitted, both directions
2000 will be checked (left-to-right will be returned if character sets
2001 exist for both directions).
2003 - Function: charset-reverse-direction-charset charset
2004 This function returns the charset (if any) with the same dimension,
2005 number of characters, and final byte as CHARSET, but which is
2006 displayed in the opposite direction.
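
The lookup rule documented for `charset-from-attributes' can be modelled with a toy Python table. This is only a model of the described behaviour, not the XEmacs implementation: the table holds a handful of rows from the predefined charsets, and returning `None' when nothing matches is an assumption of the sketch.

```python
# (dimension, chars, final, direction) -> charset name
CHARSETS = {
    (1, 94, "B", "l2r"): "ascii",
    (1, 94, "B", "r2l"): "ascii-r2l",
    (1, 96, "G", "r2l"): "arabic-iso8859-6",
    (2, 94, "B", "l2r"): "japanese-jisx0208",
}

def charset_from_attributes(dimension, chars, final, direction=None):
    """Look up a charset; with no DIRECTION, prefer left-to-right."""
    directions = (direction,) if direction else ("l2r", "r2l")
    for d in directions:
        name = CHARSETS.get((dimension, chars, final, d))
        if name is not None:
            return name
    return None

print(charset_from_attributes(1, 94, "B"))
print(charset_from_attributes(1, 94, "B", "r2l"))
print(charset_from_attributes(1, 96, "G"))
```

Because each combination of dimension and size forms its own namespace, the final byte `B' can name both `ascii' and `japanese-jisx0208' without conflict.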
2009 File: lispref.info, Node: Charset Property Functions, Next: Predefined Charsets, Prev: Basic Charset Functions, Up: Charsets
2011 Charset Property Functions
2012 --------------------------
2014 All of these functions accept either a charset name or charset object.
2016 - Function: charset-property charset prop
2017 This function returns property PROP of CHARSET. *Note Charset Properties::.
2020 Convenience functions are also provided for retrieving individual
2021 properties of a charset.
2023 - Function: charset-name charset
2024 This function returns the name of CHARSET. This will be a symbol.
2026 - Function: charset-description charset
2027 This function returns the documentation string of CHARSET.
2029 - Function: charset-registry charset
2030 This function returns the registry of CHARSET.
2032 - Function: charset-dimension charset
2033 This function returns the dimension of CHARSET.
2035 - Function: charset-chars charset
2036 This function returns the number of characters per dimension of CHARSET.
2039 - Function: charset-width charset
2040 This function returns the number of display columns per character
2041 (in TTY mode) of CHARSET.
2043 - Function: charset-direction charset
2044 This function returns the display direction of CHARSET--either `l2r' or `r2l'.
2047 - Function: charset-iso-final-char charset
2048 This function returns the final byte of the ISO 2022 escape
2049 sequence designating CHARSET.
2051 - Function: charset-iso-graphic-plane charset
2052 This function returns either 0 or 1, depending on whether the
2053 position codes of characters in CHARSET map to the left or right
2054 half of their font, respectively.
2056 - Function: charset-ccl-program charset
2057 This function returns the CCL program, if any, for converting
2058 position codes of characters in CHARSET into font indices.
2060 The two properties of a charset that can currently be set after the
2061 charset has been created are the CCL program and the font registry.
2063 - Function: set-charset-ccl-program charset ccl-program
2064 This function sets the `ccl-program' property of CHARSET to CCL-PROGRAM.
2067 - Function: set-charset-registry charset registry
2068 This function sets the `registry' property of CHARSET to REGISTRY.
2071 File: lispref.info, Node: Predefined Charsets, Prev: Charset Property Functions, Up: Charsets
2076 The following charsets are predefined in the C code.
2078 Name                     Type   Fi  Gr  Dir  Registry
2079 --------------------------------------------------------------
2080 ascii                    94     B   0   l2r  ISO8859-1
2081 control-1                94         0   l2r  ---
2082 latin-iso8859-1          96     A   1   l2r  ISO8859-1
2083 latin-iso8859-2          96     B   1   l2r  ISO8859-2
2084 latin-iso8859-3          96     C   1   l2r  ISO8859-3
2085 latin-iso8859-4          96     D   1   l2r  ISO8859-4
2086 cyrillic-iso8859-5       96     L   1   l2r  ISO8859-5
2087 arabic-iso8859-6         96     G   1   r2l  ISO8859-6
2088 greek-iso8859-7          96     F   1   l2r  ISO8859-7
2089 hebrew-iso8859-8         96     H   1   r2l  ISO8859-8
2090 latin-iso8859-9          96     M   1   l2r  ISO8859-9
2091 thai-tis620              96     T   1   l2r  TIS620
2092 katakana-jisx0201        94     I   1   l2r  JISX0201.1976
2093 latin-jisx0201           94     J   0   l2r  JISX0201.1976
2094 japanese-jisx0208-1978   94x94  @   0   l2r  JISX0208.1978
2095 japanese-jisx0208        94x94  B   0   l2r  JISX0208.19(83|90)
2096 japanese-jisx0212        94x94  D   0   l2r  JISX0212
2097 chinese-gb2312           94x94  A   0   l2r  GB2312
2098 chinese-cns11643-1       94x94  G   0   l2r  CNS11643.1
2099 chinese-cns11643-2       94x94  H   0   l2r  CNS11643.2
2100 chinese-big5-1           94x94  0   0   l2r  Big5
2101 chinese-big5-2           94x94  1   0   l2r  Big5
2102 korean-ksc5601           94x94  C   0   l2r  KSC5601
2103 composite                96x96      0   l2r  ---
2105 The following charsets are predefined in the Lisp code.
2107 Name                     Type   Fi  Gr  Dir  Registry
2108 --------------------------------------------------------------
2109 arabic-digit             94     2   0   l2r  MuleArabic-0
2110 arabic-1-column          94     3   0   r2l  MuleArabic-1
2111 arabic-2-column          94     4   0   r2l  MuleArabic-2
2112 sisheng                  94     0   0   l2r  sisheng_cwnn\|OMRON_UDC_ZH
2113 chinese-cns11643-3       94x94  I   0   l2r  CNS11643.1
2114 chinese-cns11643-4       94x94  J   0   l2r  CNS11643.1
2115 chinese-cns11643-5       94x94  K   0   l2r  CNS11643.1
2116 chinese-cns11643-6       94x94  L   0   l2r  CNS11643.1
2117 chinese-cns11643-7       94x94  M   0   l2r  CNS11643.1
2118 ethiopic                 94x94  2   0   l2r  Ethio
2119 ascii-r2l                94     B   0   r2l  ISO8859-1
2120 ipa                      96     0   1   l2r  MuleIPA
2121 vietnamese-viscii-lower  96     1   1   l2r  VISCII1.1
2122 vietnamese-viscii-upper  96     2   1   l2r  VISCII1.1
2124 For all of the above charsets, the dimension and number of columns are the same.
2127 Note that ASCII, Control-1, and Composite are handled specially.
2128 This is why some of the fields are blank; and some of the filled-in
2129 fields (e.g. the type) are not really accurate.
2132 File: lispref.info, Node: MULE Characters, Next: Composite Characters, Prev: Charsets, Up: MULE
2137 - Function: make-char charset arg1 &optional arg2
2138 This function makes a multi-byte character from CHARSET and octets ARG1 and ARG2.
2141 - Function: char-charset character
2142 This function returns the character set of char CHARACTER.
2144 - Function: char-octet character &optional n
2145 This function returns the octet (i.e. position code) numbered N
2146 (should be 0 or 1) of char CHARACTER. N defaults to 0 if omitted.
2148 - Function: find-charset-region start end &optional buffer
2149 This function returns a list of the charsets in the region between
2150 START and END. BUFFER defaults to the current buffer if omitted.
2152 - Function: find-charset-string string
2153 This function returns a list of the charsets in STRING.
2156 File: lispref.info, Node: Composite Characters, Next: Coding Systems, Prev: MULE Characters, Up: MULE
2158 Composite Characters
2159 ====================
2161 Composite characters are not yet completely implemented.
2163 - Function: make-composite-char string
2164 This function converts a string into a single composite character.
2165 The character is the result of overstriking all the characters in the string.
2168 - Function: composite-char-string character
2169 This function returns a string of the characters comprising a
2170 composite character.
2172 - Function: compose-region start end &optional buffer
2173 This function composes the characters in the region from START to
2174 END in BUFFER into one composite character. The composite
2175 character replaces the composed characters. BUFFER defaults to
2176 the current buffer if omitted.
2178 - Function: decompose-region start end &optional buffer
2179 This function decomposes any composite characters in the region
2180 from START to END in BUFFER. This converts each composite
2181 character into one or more characters, the individual characters
2182 out of which the composite character was formed. Non-composite
2183 characters are left as-is. BUFFER defaults to the current buffer if omitted.
2187 File: lispref.info, Node: Coding Systems, Next: CCL, Prev: Composite Characters, Up: MULE
2192 A coding system is an object that defines how text containing multiple
2193 character sets is encoded into a stream of (typically 8-bit) bytes. The
2194 coding system is used to decode the stream into a series of characters
2195 (which may be from multiple charsets) when the text is read from a file
2196 or process, and is used to encode the text back into the same format
2197 when it is written out to a file or process.
2199 For example, many ISO-2022-compliant coding systems (such as Compound
2200 Text, which is used for inter-client data under the X Window System) use
2201 escape sequences to switch between different charsets - Japanese Kanji,
2202 for example, is invoked with `ESC $ ( B'; ASCII is invoked with `ESC (
2203 B'; and Cyrillic is invoked with `ESC - L'. See `make-coding-system'
2204 for more information.
2206 Coding systems are normally identified using a symbol, and the
2207 symbol is accepted in place of the actual coding system object whenever
2208 a coding system is called for. (This is similar to how faces and charsets work.)
2211 - Function: coding-system-p object
2212 This function returns non-`nil' if OBJECT is a coding system.
2216 * Coding System Types:: Classifying coding systems.
2217 * ISO 2022:: An international standard for
2218 charsets and encodings.
2219 * EOL Conversion:: Dealing with different ways of denoting
2221 * Coding System Properties:: Properties of a coding system.
2222 * Basic Coding System Functions:: Working with coding systems.
2223 * Coding System Property Functions:: Retrieving a coding system's properties.
2224 * Encoding and Decoding Text:: Encoding and decoding text.
2225 * Detection of Textual Encoding:: Determining how text is encoded.
2226 * Big5 and Shift-JIS Functions:: Special functions for these non-standard
2228 * Predefined Coding Systems:: Coding systems implemented by MULE.
2231 File: lispref.info, Node: Coding System Types, Next: ISO 2022, Up: Coding Systems
2236 The coding system type determines the basic algorithm XEmacs will use to
2237 decode or encode a data stream. Character encodings will be converted
2238 to the MULE encoding, escape sequences processed, and newline sequences
2239 converted to XEmacs's internal representation. There are three basic
2240 classes of coding system type: no-conversion, ISO-2022, and special.
2242 No conversion allows you to look at the file's internal
2243 representation. Since XEmacs is basically a text editor, "no
2244 conversion" does convert newline conventions by default. (Use the
2245 `binary' coding system if this is not desired.)
2247 ISO 2022 (*note ISO 2022::) is the basic international standard
2248 regulating use of "coded character sets for the exchange of data", i.e.,
2249 text streams. ISO 2022 contains functions that make it possible to
2250 encode text streams to comply with restrictions of the Internet mail
2251 system and de facto restrictions of most file systems (e.g., use of the
2252 separator character in file names). Coding systems which are not ISO
2253 2022 conformant can be difficult to handle. Perhaps more important,
2254 they are not adaptable to multilingual information interchange, with
2255 the obvious exception of ISO 10646 (Unicode). (Unicode is partially
2256 supported by XEmacs with the addition of the Lisp package ucs-conv.)
2258 The special class of coding systems includes automatic detection,
2259 CCL (a "little language" embedded as an interpreter, useful for
2260 translating between variants of a single character set),
2261 non-ISO-2022-conformant encodings like Unicode, Shift JIS, and Big5,
2262 and MULE internal coding. (NB: this list is based on XEmacs 21.2.
2263 Terminology may vary slightly for other versions of XEmacs and for GNU Emacs.)
2267 No conversion, for binary files, and a few special cases of
2268 non-ISO-2022 coding systems where conversion is done by hook
2269 functions (usually implemented in CCL). On output, graphic
2270 characters that are not in ASCII or Latin-1 will be replaced by a
2271 `?'. (For a no-conversion-encoded buffer, these characters will
2272 only be present if you explicitly insert them.)
2275 Any ISO-2022-compliant encoding. Among others, this includes JIS
2276 (the Japanese encoding commonly used for e-mail), national
2277 variants of EUC (the standard Unix encoding for Japanese and other
2278 languages), and Compound Text (an encoding used in X11). You can
2279 specify more specific information about the conversion with the
2283 ISO 10646 UCS-4 encoding. A 31-bit fixed-width superset of Unicode.
2287 ISO 10646 UTF-8 encoding. A "file system safe" transformation
2288 format that can be used with both UCS-4 and Unicode.
2291 Automatic conversion. XEmacs attempts to detect the coding system used in the file.
2295 Shift-JIS (a Japanese encoding commonly used in PC operating systems).
2299 Big5 (the encoding commonly used in Taiwan for Chinese text).
2302 The conversion is performed using a user-written pseudo-code
2303 program. CCL (Code Conversion Language) is the name of this
2304 pseudo-code. For example, CCL is used to map KOI8-R characters
2305 (an encoding for Russian Cyrillic) to ISO8859-5 (the form used
2306 internally by MULE).
2309 Write out or read in the raw contents of the memory representing
2310 the buffer's text. This is primarily useful for debugging
2311 purposes, and is only enabled when XEmacs has been compiled with
2312 `DEBUG_XEMACS' set (the `--debug' configure option). *Warning*:
2313 Reading in a file using `internal' conversion can result in an
2314 internal inconsistency in the memory representing a buffer's text,
2315 which will produce unpredictable results and may cause XEmacs to
2316 crash. Under normal circumstances you should never use `internal' conversion.
2320 File: lispref.info, Node: ISO 2022, Next: EOL Conversion, Prev: Coding System Types, Up: Coding Systems
2325 This section briefly describes the ISO 2022 encoding standard. A more
2326 thorough treatment is available in the original document of ISO 2022 as
2327 well as various national standards (such as JIS X 0202).
2329 Character sets ("charsets") are classified into the following four
2330 categories, according to the number of characters in the charset:
2331 94-charset, 96-charset, 94x94-charset, and 96x96-charset. This means
2332 that although an ISO 2022 coding system may have variable width
2333 characters, each charset used is fixed-width (in contrast to the MULE
2334 character set and UTF-8, for example).
2336 ISO 2022 provides for switching between character sets via escape
2337 sequences. This switching is somewhat complicated, because ISO 2022
2338 provides for both legacy applications like Internet mail that accept
2339 only 7 significant bits in some contexts (RFC 822 headers, for example),
2340 and more modern "8-bit clean" applications. It also provides for
2341 compact and transparent representation of languages like Japanese which
2342 mix ASCII and a national script (even outside of computer programs).
2344 First, ISO 2022 codified prevailing practice by dividing the code
2345 space into "control" and "graphic" regions. The code points 0x00-0x1F
2346 and 0x80-0x9F are reserved for "control characters", while "graphic
2347 characters" must be assigned to code points in the regions 0x20-0x7F and
2348 0xA0-0xFF. The positions 0x20 and 0x7F are special, and under some
2349 circumstances must be assigned the graphic character "ASCII SPACE" and
2350 the control character "ASCII DEL" respectively.
2352 The various regions are given the name C0 (0x00-0x1F), GL
2353 (0x20-0x7F), C1 (0x80-0x9F), and GR (0xA0-0xFF). GL and GR stand for
2354 "graphic left" and "graphic right", respectively, because of the
2355 standard method of displaying graphic character sets in tables with the
2356 high byte indexing columns and the low byte indexing rows. I don't
2357 find it very intuitive, but these are called "registers".
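
The four regions can be told apart with simple range tests, as in this illustrative Python sketch (the function name is hypothetical):

```python
def region(byte):
    """Classify an 8-bit code point as C0, GL, C1 or GR."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not an 8-bit value")
    if byte <= 0x1F:
        return "C0"
    if byte <= 0x7F:
        return "GL"
    if byte <= 0x9F:
        return "C1"
    return "GR"

for b in (0x0A, 0x41, 0x7F, 0x85, 0xA4):
    print(hex(b), region(b))
```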
2359 An ISO 2022-conformant encoding for a graphic character set must use
2360 a fixed number of bytes per character, and the values must fit into a
2361 single register; that is, each byte must range over either 0x20-0x7F, or
2362 0xA0-0xFF. It is not allowed to extend the range of the repertoire of a
2363 character set by using both ranges at the same time. This is why a standard
2364 character set such as ISO 8859-1 is actually considered by ISO 2022 to
2365 be an aggregation of two character sets, ASCII and LATIN-1, and why it
2366 is technically incorrect to refer to ISO 8859-1 as "Latin 1". Also, a
2367 single character's bytes must all be drawn from the same register; this
2368 is why Shift JIS (for Japanese) and Big 5 (for Chinese) are not ISO
2369 2022-compatible encodings.
2371 The reason for this restriction becomes clear when you attempt to
2372 define an efficient, robust encoding for a language like Japanese.
2373 Like ISO 8859, Japanese encodings are aggregations of several character
2374 sets. In practice, the vast majority of characters are drawn from the
2375 "JIS Roman" character set (a derivative of ASCII; it won't hurt to
2376 think of it as ASCII) and the JIS X 0208 standard "basic Japanese"
2377 character set including not only ideographic characters ("kanji") but
2378 syllabic Japanese characters ("kana"), a wide variety of symbols, and
2379 many alphabetic characters (Roman, Greek, and Cyrillic) as well.
2380 Although JIS X 0208 includes the whole Roman alphabet, as a 2-byte code
2381 it is not suited to programming; thus the inclusion of ASCII in the
2382 standard Japanese encodings.
2384 For normal Japanese text such as in newspapers, a broad repertoire of
2385 approximately 3000 characters is used. Evidently this won't fit into
2386 one byte; two must be used. But much of the text processed by Japanese
2387 computers is computer source code, nearly all of which is ASCII. A not
2388 insignificant portion of ordinary text is English (as such or as
2389 borrowed Japanese vocabulary) or other languages which can be represented
2390 at least approximately in ASCII, as well. It seems reasonable then to
2391 represent ASCII in one byte, and JIS X 0208 in two. And this is exactly
2392 what the Extended Unix Code for Japanese (EUC-JP) does. ASCII is
2393 invoked to the GL register, and JIS X 0208 is invoked to the GR
2394 register. Thus, each byte can be tested for its character set by
2395 looking at the high bit; if set, it is Japanese, if clear, it is ASCII.
2396 Furthermore, since control characters like newline can never be part of
2397 a graphic character, even in the case of corruption in transmission the
2398 stream will be resynchronized at every line break, on the order of 60-80
2399 bytes. This coding system requires no escape sequences or special
2400 control codes to represent 99.9% of all Japanese text.
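
Both properties described above, the per-byte high-bit test and resynchronization at line breaks, can be tried out with a short Python sketch. It uses Python's standard `euc_jp' codec; the helper names are hypothetical, and the byte classification deliberately ignores the rarer single-shift prefixes.

```python
def classify(data):
    """Label each byte by its high bit alone, with no context needed."""
    return ["japanese" if b & 0x80 else "ascii" for b in data]

def decode_lines(data):
    """Decode each line independently; damage cannot cross a newline."""
    return [line.decode("euc_jp", errors="replace")
            for line in data.split(b"\n")]

good = "\u3042\u3042\nabc".encode("euc_jp")
damaged = good[1:]   # simulate the loss of one byte in transmission

print(classify(b"A\xa4\xa2"))
print(decode_lines(damaged)[1])   # the second line is unaffected
```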
2402 Note carefully the distinction between the character sets (ASCII and
2403 JIS X 0208), the encoding (EUC-JP), and the coding system (ISO 2022).
2404 The JIS X 0208 character set is used in three different encodings for
2405 Japanese, but in ISO-2022-JP it is invoked into GL (so the high bit is
2406 always clear), in EUC-JP it is invoked into GR (setting the high bit in
2407 the process), and in Shift JIS the high bit may be set or reset, and the
2408 significant bits are shifted within the 16-bit character so that the two
2409 main character sets can coexist with a third (the "halfwidth katakana"
2410 of JIS X 0201). As the name implies, the ISO-2022-JP encoding is also a
2411 version of the ISO-2022 coding system.
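
The three encodings can be compared directly using Python's standard codecs, which is offered here only as an illustration. The sample character is HIRAGANA LETTER A, which belongs to JIS X 0208.

```python
SAMPLE = "\u3042"   # HIRAGANA LETTER A, a JIS X 0208 character

encoded = {codec: SAMPLE.encode(codec)
           for codec in ("iso2022_jp", "euc_jp", "shift_jis")}

for codec, data in encoded.items():
    high_bits = ["set" if b & 0x80 else "clear" for b in data]
    print(codec, data.hex(), high_bits)
```

In the ISO-2022-JP output the two character bytes are surrounded by the escape sequences that switch into and back out of JIS X 0208, and every byte has its high bit clear; in EUC-JP both bytes have it set.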
2413 In order to systematically treat subsidiary character sets (like the
2414 "halfwidth katakana" already mentioned, and the "supplementary kanji" of
2415 JIS X 0212), four further registers are defined: G0, G1, G2, and G3.
2416 Unlike GL and GR, they are not logically distinguished by internal
2417 format. Instead, the process of "invocation" mentioned earlier is
2418 broken into two steps: first, a character set is "designated" to one of
2419 the registers G0-G3 by use of an "escape sequence" of the form:
2421      ESC [I] I F
2423 where I is an intermediate character or characters in the range 0x20
2424 - 0x3F, and F, from the range 0x30-0x7E, is the final character
2425 identifying this charset. (Final characters in the range 0x30-0x3F are
2426 reserved for private use and will never have a publicly registered meaning.)
2429 Then that register is "invoked" to either GL or GR, either
2430 automatically (designations to G0 normally involve invocation to GL as
2431 well), or by use of shifting (affecting only the following character in
2432 the data stream) or locking (effective until the next designation or
2433 locking) control sequences. An encoding conformant to ISO 2022 is
2434 typically defined by designating the initial contents of the G0-G3
2435 registers, specifying a 7 or 8 bit environment, and specifying whether
2436 further designations will be recognized.
2438 Some examples of character sets and the registered final characters
2439 F used to designate them:
2442 ASCII (B), left (J) and right (I) half of JIS X 0201, ...
2445 Latin-1 (A), Latin-2 (B), Latin-3 (C), ...
2448 GB2312 (A), JIS X 0208 (B), KSC5601 (C), ...
2453 The meanings of the various characters in these sequences, where not
2454 specified by the ISO 2022 standard (such as the ESC character), are
2455 assigned by "ECMA", the European Computer Manufacturers Association.
2457 The meanings of the intermediate characters are:
2459 $ [0x24]: indicate charset of dimension 2 (94x94 or 96x96).
2460 ( [0x28]: designate to G0 a 94-charset whose final byte is F.
2461 ) [0x29]: designate to G1 a 94-charset whose final byte is F.
2462 * [0x2A]: designate to G2 a 94-charset whose final byte is F.
2463 + [0x2B]: designate to G3 a 94-charset whose final byte is F.
2464 , [0x2C]: designate to G0 a 96-charset whose final byte is F.
2465 - [0x2D]: designate to G1 a 96-charset whose final byte is F.
2466 . [0x2E]: designate to G2 a 96-charset whose final byte is F.
2467 / [0x2F]: designate to G3 a 96-charset whose final byte is F.
2469 The comma may be used in files read and written only by MULE, as a
2470 MULE extension, but this is illegal in ISO 2022. (The reason is that
2471 in ISO 2022 G0 must be a 94-member character set, with 0x20 assigned
2472 the value SPACE, and 0x7F assigned the value DEL.)
2474 Here are examples of designations:
2476 ESC ( B : designate to G0 ASCII
2477 ESC - A : designate to G1 Latin-1
2478 ESC $ ( A or ESC $ A : designate to G0 GB2312
2479 ESC $ ( B or ESC $ B : designate to G0 JISX0208
2480 ESC $ ) C : designate to G1 KSC5601
2482 (The short forms used to designate GB2312 and JIS X 0208 are for
2483 backwards compatibility; the long forms are preferred.)
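The designation sequences above can be decoded mechanically from the table of intermediate characters. The following Python sketch is an illustration only; the function name `parse_designation' and its return format are hypothetical, and it handles just the forms listed in this section.

```python
# Intermediate bytes from the table above: target register and charset size.
INTERMEDIATES = {
    0x28: ("G0", 94), 0x29: ("G1", 94), 0x2A: ("G2", 94), 0x2B: ("G3", 94),
    0x2C: ("G0", 96), 0x2D: ("G1", 96), 0x2E: ("G2", 96), 0x2F: ("G3", 96),
}

def parse_designation(seq: bytes):
    """Return (register, charset size, dimension, final character)."""
    if len(seq) < 3 or seq[0] != 0x1B:
        raise ValueError("not a designation sequence")
    body = seq[1:]
    dimension = 1
    if body[0] == 0x24:              # '$' marks a charset of dimension 2
        dimension = 2
        body = body[1:]
    if dimension == 2 and len(body) == 1:
        # Short form (ESC $ F): a 94x94 charset designated to G0.
        return ("G0", 94, 2, chr(body[0]))
    if len(body) != 2 or body[0] not in INTERMEDIATES:
        raise ValueError("unrecognized designation")
    register, size = INTERMEDIATES[body[0]]
    return (register, size, dimension, chr(body[1]))

print(parse_designation(b"\x1b$)C"))   # the KSC5601 designation shown above
```

Note that the short and long forms for JIS X 0208 parse to the same result, which is why the short forms can be kept for backwards compatibility.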
2485 To use a charset designated to G2 or G3, and to use a charset
2486 designated to G1 in a 7-bit environment, you must explicitly invoke G1,
2487 G2, or G3 into GL. There are two types of invocation: Locking Shift
2488 (in effect until the next locking shift) and Single Shift (one character only).
2490 Locking Shift is done as follows:
2492 LS0 or SI (0x0F): invoke G0 into GL
2493 LS1 or SO (0x0E): invoke G1 into GL
2494 LS2: invoke G2 into GL
2495 LS3: invoke G3 into GL
2496 LS1R: invoke G1 into GR
2497 LS2R: invoke G2 into GR
2498 LS3R: invoke G3 into GR
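As an illustration of the SI/SO pair from the list above, the following Python sketch tracks which register is currently invoked into GL in a 7-bit stream. It is a simplified model under stated assumptions: only SI and SO are handled, G0 is assumed to be invoked initially, and the function name `split_by_locking_shift' is hypothetical.

```python
SI, SO = 0x0F, 0x0E

def split_by_locking_shift(data: bytes):
    """Return (register, bytes) runs, tracking the register invoked into GL."""
    gl = "G0"                      # assumed initial state: G0 invoked into GL
    run = bytearray()
    runs = []
    for b in data:
        if b in (SI, SO):
            if run:
                runs.append((gl, bytes(run)))
                run = bytearray()
            gl = "G0" if b == SI else "G1"   # the shift stays in effect
        else:
            run.append(b)
    if run:
        runs.append((gl, bytes(run)))
    return runs

print(split_by_locking_shift(b"ab\x0eGQ\x0fcd"))
```

Each run would then be interpreted using whatever character set is designated to the named register.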
2500 Single Shift is done as follows:
2502 SS2 or ESC N: invoke G2 into GL
2503 SS3 or ESC O: invoke G3 into GL
2505 The shift functions (such as LS1R and SS3) are represented by control
2506 characters (from C1) in 8-bit environments and by escape sequences in 7-bit environments.
2509 (#### Ben says: I think the above is slightly incorrect. It appears
2510 that SS2 invokes G2 into GR and SS3 invokes G3 into GR, whereas ESC N
2511 and ESC O behave as indicated. The above definitions will not parse
2512 EUC-encoded text correctly, and it looks like the code in mule-coding.c
2513 has similar problems.)
2515 Evidently there are many ISO-2022-compliant ways of encoding
2516 multilingual text. Many coding systems in actual use,
2517 such as X11's Compound Text, Japanese JUNET code, and so-called EUC
2518 (Extended UNIX Code), are variants of ISO 2022.
2520 In MULE, we characterize a version of ISO 2022 by the following attributes:
2523 1. The character sets initially designated to G0 through G3.
2525 2. Whether short form designations are allowed for Japanese and Chinese.
2528 3. Whether ASCII should be designated to G0 before control characters.
2530 4. Whether ASCII should be designated to G0 at the end of line.
2532 5. 7-bit environment or 8-bit environment.
2534 6. Whether Locking Shifts are used or not.
2536 7. Whether to use ASCII or the variant JIS X 0201-1976-Roman.
2538 8. Whether to use JIS X 0208-1983 or the older version JIS X
2541 (The last two are only for Japanese.)
2543 By specifying these attributes, you can create any variant of ISO 2022.
2546 Here are several examples:
2548 ISO-2022-JP -- Coding system used in Japanese email (RFC 1468).
2549 1. G0 <- ASCII, G1..3 <- never used
2553 5. 7-bit environment
2556 8. Use JIS X 0208-1983
2558 ctext -- X11 Compound Text
2559 1. G0 <- ASCII, G1 <- Latin-1, G2,3 <- never used.
2563 5. 8-bit environment.
2566 8. Use JIS X 0208-1983.
2568 euc-china -- Chinese EUC. Often called the "GB encoding", but that is
2569 technically incorrect.
2570 1. G0 <- ASCII, G1 <- GB 2312, G2,3 <- never used.
2574 5. 8-bit environment.
2577 8. Use JIS X 0208-1983.
2579 ISO-2022-KR -- Coding system used in Korean email.
2580 1. G0 <- ASCII, G1 <- KSC 5601, G2,3 <- never used.
2584 5. 7-bit environment.
2587 8. Use JIS X 0208-1983.
2589 MULE creates all of these coding systems by default.
2592 File: lispref.info, Node: EOL Conversion, Next: Coding System Properties, Prev: ISO 2022, Up: Coding Systems
2598 Automatically detect the end-of-line type (LF, CRLF, or CR). Also
2599 generate subsidiary coding systems named `NAME-unix', `NAME-dos',
2600 and `NAME-mac', that are identical to this coding system but have
2601 an EOL-TYPE value of `lf', `crlf', and `cr', respectively.
2604 The end of a line is marked externally using ASCII LF. Since this
2605 is also the way that XEmacs represents an end-of-line internally,
2606 specifying this option results in no end-of-line conversion. This
2607 is the standard format for Unix text files.
2610 The end of a line is marked externally using ASCII CRLF. This is
2611 the standard format for MS-DOS text files.
2614 The end of a line is marked externally using ASCII CR. This is the
2615 standard format for Macintosh text files.
2618 Automatically detect the end-of-line type but do not generate
2619 subsidiary coding systems. (This value is converted to `nil' when
2620 stored internally, and `coding-system-property' will return `nil'.)
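The detection and conversion described above amount to very little code. The following Python sketch shows one plausible way to do it; it is not the XEmacs implementation, and the names `detect_eol' and `to_internal' are hypothetical. It decides on the basis of the first line break only.

```python
def detect_eol(data: bytes):
    """Return 'crlf', 'cr', or 'lf' for the first line break, or None."""
    for i, b in enumerate(data):
        if b == 0x0D:
            return "crlf" if data[i + 1:i + 2] == b"\n" else "cr"
        if b == 0x0A:
            return "lf"
    return None

def to_internal(data: bytes, eol_type: str) -> bytes:
    """Convert external line breaks to LF, the internal representation."""
    if eol_type == "crlf":
        return data.replace(b"\r\n", b"\n")
    if eol_type == "cr":
        return data.replace(b"\r", b"\n")
    return data                    # 'lf' requires no conversion

sample = b"first\r\nsecond\r\n"
print(detect_eol(sample), to_internal(sample, detect_eol(sample)))
```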
2623 File: lispref.info, Node: Coding System Properties, Next: Basic Coding System Functions, Prev: EOL Conversion, Up: Coding Systems
2625 Coding System Properties
2626 ------------------------
2629 String to be displayed in the modeline when this coding system is
2633 End-of-line conversion to be used. It should be one of the types
2634 listed in *Note EOL Conversion::.
2637 The coding system which is the same as this one, except that it
2638 uses the Unix line-breaking convention.
2641 The coding system which is the same as this one, except that it
2642 uses the DOS line-breaking convention.
2645 The coding system which is the same as this one, except that it
2646 uses the Macintosh line-breaking convention.
2648 `post-read-conversion'
2649 Function called after a file has been read in, to perform the
2650 decoding. Called with two arguments, START and END, denoting a
2651 region of the current buffer to be decoded.
2653 `pre-write-conversion'
2654 Function called before a file is written out, to perform the
2655 encoding. Called with two arguments, START and END, denoting a
2656 region of the current buffer to be encoded.
2658 The following additional properties are recognized if TYPE is
2665 The character set initially designated to the G0 - G3 registers.
2666 The value should be one of
2668 * A charset object (designate that character set)
2670 * `nil' (do not ever use this register)
2672 * `t' (no character set is initially designated to the
2673 register, but may be later on; this automatically sets the
2674 corresponding `force-g*-on-output' property)
2676 `force-g0-on-output'
2677 `force-g1-on-output'
2678 `force-g2-on-output'
2679 `force-g3-on-output'
2680 If non-`nil', send an explicit designation sequence on output
2681 before using the specified register.
2684 If non-`nil', use the short forms `ESC $ @', `ESC $ A', and `ESC $
2685 B' on output in place of the full designation sequences `ESC $ (
2686 @', `ESC $ ( A', and `ESC $ ( B'.
2689 If non-`nil', don't designate ASCII to G0 at each end of line on
2690 output. Setting this to non-`nil' also suppresses other
2691 state-resetting that normally happens at the end of a line.
2694 If non-`nil', don't designate ASCII to G0 before control chars on
2698 If non-`nil', use 7-bit environment on output. Otherwise, use
2702 If non-`nil', use locking-shift (SO/SI) instead of single-shift or
2703 designation by escape sequence.
2706 If non-`nil', don't use ISO6429's direction specification.
2709 If non-`nil', literal control characters that are the same as the
2710 beginning of a recognized ISO 2022 or ISO 6429 escape sequence (in
2711 particular, ESC (0x1B), SO (0x0E), SI (0x0F), SS2 (0x8E), SS3
2712 (0x8F), and CSI (0x9B)) are "quoted" with an escape character so
2713 that they can be properly distinguished from an escape sequence.
2714 (Note that doing this results in a non-portable encoding.) This
2715 encoding flag is used for byte-compiled files. Note that ESC is a
2716 good choice for a quoting character because there are no escape
2717 sequences whose second byte is a character from the Control-0 or
2718 Control-1 character sets; this is explicitly disallowed by the ISO
2721 `input-charset-conversion'
2722 A list of conversion specifications, specifying conversion of
2723 characters in one charset to another when decoding is performed.
2724 Each specification is a list of two elements: the source charset,
2725 and the destination charset.
2727 `output-charset-conversion'
2728 A list of conversion specifications, specifying conversion of
2729 characters in one charset to another when encoding is performed.
2730 The form of each specification is the same as for
2731 `input-charset-conversion'.
2733 The following additional properties are recognized (and required) if
2737 CCL program used for decoding (converting to internal format).
2740 CCL program used for encoding (converting to external format).
2742 The following properties are used internally: EOL-CR, EOL-CRLF,
2746 File: lispref.info, Node: Basic Coding System Functions, Next: Coding System Property Functions, Prev: Coding System Properties, Up: Coding Systems
2748 Basic Coding System Functions
2749 -----------------------------
2751 - Function: find-coding-system coding-system-or-name
2752 This function retrieves the coding system of the given name.
2754 If CODING-SYSTEM-OR-NAME is a coding-system object, it is simply
2755 returned. Otherwise, CODING-SYSTEM-OR-NAME should be a symbol.
2756 If there is no such coding system, `nil' is returned. Otherwise
2757 the associated coding system object is returned.
2759 - Function: get-coding-system name
2760 This function retrieves the coding system of the given name. Same
2761 as `find-coding-system' except an error is signalled if there is no
2762 such coding system instead of returning `nil'.
2764 - Function: coding-system-list
2765 This function returns a list of the names of all defined coding
2768 - Function: coding-system-name coding-system
2769 This function returns the name of the given coding system.
2771 - Function: coding-system-base coding-system
2772 Returns the base coding system (undecided EOL convention) coding
2775 - Function: make-coding-system name type &optional doc-string props
2776 This function registers symbol NAME as a coding system.
2778 TYPE describes the conversion method used and should be one of the
2779 types listed in *Note Coding System Types::.
2781 DOC-STRING is a string describing the coding system.
2783 PROPS is a property list, describing the specific nature of the
2784 character set. Recognized properties are as in *Note Coding
2785 System Properties::.
2787 - Function: copy-coding-system old-coding-system new-name
2788 This function copies OLD-CODING-SYSTEM to NEW-NAME. If NEW-NAME
2789 does not name an existing coding system, a new one will be created.
2791 - Function: subsidiary-coding-system coding-system eol-type
2792 This function returns the subsidiary coding system of
2793 CODING-SYSTEM with eol type EOL-TYPE.
2796 File: lispref.info, Node: Coding System Property Functions, Next: Encoding and Decoding Text, Prev: Basic Coding System Functions, Up: Coding Systems
2798 Coding System Property Functions
2799 --------------------------------
2801 - Function: coding-system-doc-string coding-system
2802 This function returns the doc string for CODING-SYSTEM.
2804 - Function: coding-system-type coding-system
2805 This function returns the type of CODING-SYSTEM.
2807 - Function: coding-system-property coding-system prop
2808 This function returns the PROP property of CODING-SYSTEM.
2811 File: lispref.info, Node: Encoding and Decoding Text, Next: Detection of Textual Encoding, Prev: Coding System Property Functions, Up: Coding Systems
2813 Encoding and Decoding Text
2814 --------------------------
2816 - Function: decode-coding-region start end coding-system &optional
2818 This function decodes the text between START and END which is
2819 encoded in CODING-SYSTEM. This is useful if you've read in
2820 encoded text from a file without decoding it (e.g. you read in a
2821 JIS-formatted file but used the `binary' or `no-conversion' coding
2822 system, so that it shows up as `^[$B!<!+^[(B'). The length of the
2823 encoded text is returned. BUFFER defaults to the current buffer
2826 - Function: encode-coding-region start end coding-system &optional
2828 This function encodes the text between START and END using
2829 CODING-SYSTEM. This will, for example, convert Japanese
2830 characters into byte sequences such as `^[$B!<!+^[(B' if you use the JIS
2831 encoding. The length of the encoded text is returned. BUFFER
2832 defaults to the current buffer if unspecified.
2835 File: lispref.info, Node: Detection of Textual Encoding, Next: Big5 and Shift-JIS Functions, Prev: Encoding and Decoding Text, Up: Coding Systems
2837 Detection of Textual Encoding
2838 -----------------------------
2840 - Function: coding-category-list
2841 This function returns a list of all recognized coding categories.
2843 - Function: set-coding-priority-list list
2844 This function changes the priority order of the coding categories.
2845 LIST should be a list of coding categories, in descending order of
2846 priority. Unspecified coding categories will be lower in priority
2847 than all specified ones, in the same relative order they were in
2850 - Function: coding-priority-list
2851 This function returns a list of coding categories in descending
2854 - Function: set-coding-category-system coding-category coding-system
2855 This function changes the coding system associated with a coding
2858 - Function: coding-category-system coding-category
2859 This function returns the coding system associated with a coding
2862 - Function: detect-coding-region start end &optional buffer
2863 This function detects the coding system of the text in the region
2864 between START and END. The returned value is a list of possible coding
2865 systems ordered by priority. If only ASCII characters are found,
2866 it returns `autodetect' or one of its subsidiary coding systems
2867 according to a detected end-of-line type. Optional arg BUFFER
2868 defaults to the current buffer.
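The role of the priority list can be illustrated with a simplified model: try each candidate in priority order and keep those that accept the data. This Python sketch is an assumption about the general technique, not a description of the XEmacs detector; the name `detect_candidates' is hypothetical and Python codec names stand in for coding categories.

```python
def detect_candidates(data: bytes, priority):
    """Return the codecs, in PRIORITY order, that decode DATA without error."""
    candidates = []
    for name in priority:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        candidates.append(name)
    return candidates

japanese = "\u65e5\u672c".encode("euc_jp")
print(detect_candidates(japanese, ["ascii", "euc_jp", "latin_1"]))
print(detect_candidates(japanese, ["latin_1", "euc_jp", "ascii"]))
```

Because a permissive 8-bit encoding accepts any byte sequence, placing it ahead of the EUC encodings in the priority list makes it the first candidate even for Japanese text.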
2871 File: lispref.info, Node: Big5 and Shift-JIS Functions, Next: Predefined Coding Systems, Prev: Detection of Textual Encoding, Up: Coding Systems
2873 Big5 and Shift-JIS Functions
2874 ----------------------------
2876 These are special functions for working with the non-standard Shift-JIS and Big5 encodings.
2879 - Function: decode-shift-jis-char code
2880 This function decodes a JIS X 0208 character in the Shift-JIS
2881 coding system. CODE is the character code in Shift-JIS as a cons
2882 of two bytes. The corresponding character is returned.
2884 - Function: encode-shift-jis-char character
2885 This function encodes the JIS X 0208 character CHARACTER in the
2886 Shift-JIS coding system. The corresponding character code in
2887 Shift-JIS is returned as a cons of two bytes.
2889 - Function: decode-big5-char code
2890 This function decodes a character of the Big5 coding system.
2891 CODE is the character code in Big5. The corresponding character is returned.
2894 - Function: encode-big5-char character
2895 This function encodes the Big5 character CHARACTER in the Big5
2896 coding system. The corresponding character code in Big5 is returned.
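For readers curious about what a Shift-JIS byte pair looks like, the following Python sketch converts a JIS X 0208 byte pair to its Shift-JIS byte pair using the commonly published arithmetic. It is an illustration, not the code behind `encode-shift-jis-char'; the function name `jis_to_shift_jis' is hypothetical, and a Python tuple stands in for the cons of two bytes.

```python
def jis_to_shift_jis(j1: int, j2: int):
    """Convert a JIS X 0208 byte pair (each 0x21-0x7E) to Shift-JIS."""
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("not a JIS X 0208 code")
    # Two JIS rows share one Shift-JIS lead byte.
    s1 = ((j1 + 1) >> 1) + (0x70 if j1 <= 0x5E else 0xB0)
    if j1 & 1:      # odd row: trail byte 0x40-0x9E, skipping 0x7F
        s2 = j2 + 0x1F + (1 if j2 >= 0x60 else 0)
    else:           # even row: trail byte 0x9F-0xFC
        s2 = j2 + 0x7E
    return (s1, s2)

print([hex(b) for b in jis_to_shift_jis(0x24, 0x22)])
```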
2900 File: lispref.info, Node: Predefined Coding Systems, Prev: Big5 and Shift-JIS Functions, Up: Coding Systems
2902 Coding Systems Implemented
2903 --------------------------
2905 MULE initializes most of the commonly used coding systems at XEmacs's
2906 startup. A few others are initialized only when the relevant language
2907 environment is selected and support libraries are loaded. (NB: The
2908 following list is based on XEmacs 21.2.19, the development branch at the
2909 time of writing. The list may be somewhat different for other
2910 versions. Recent versions of GNU Emacs 20 implement a few more rare
2911 coding systems; work is being done to port these to XEmacs.)
2913 Unfortunately, there is not a consistent naming convention for
2914 character sets, and for practical purposes coding systems often take
2915 their name from their principal character sets (ASCII, KOI8-R, Shift
2916 JIS). Others take their names from the coding system (ISO-2022-JP,
2917 EUC-KR), and a few from their non-text usages (internal, binary). To
2918 provide for this, and for the fact that many coding systems have
2919 several common names, an aliasing system is provided. Finally, some
2920 effort has been made to use names that are registered as MIME charsets
2921 (this is why the name 'shift_jis contains that un-Lisp-y underscore).
2923 There is a systematic naming convention regarding end-of-line (EOL)
2924 conventions for different systems. A coding system whose name ends in
2925 "-unix" forces the assumption that lines are broken by newlines (0x0A).
2926 A coding system whose name ends in "-mac" forces the assumption that
2927 lines are broken by ASCII CRs (0x0D). A coding system whose name ends
2928 in "-dos" forces the assumption that lines are broken by CRLF sequences
2929 (0x0D 0x0A). These subsidiary coding systems are automatically derived
2930 from a base coding system. Use of the base coding system implies
2931 autodetection of the text file convention. (The fact that the -unix,
2932 -mac, and -dos are derived from a base system results in them showing up
2933 as "aliases" in `list-coding-systems'.) These subsidiaries have a
2934 consistent modeline indicator as well. "-dos" coding systems have ":T"
2935 appended to their modeline indicator, while "-mac" coding systems have
2936 ":t" appended (e.g., "ISO8:t" for iso-2022-8-mac).
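The naming convention can be summarized as a pair of lookup tables. In this Python sketch the function name `subsidiary' is hypothetical, and the assumption that "-unix" coding systems add nothing to the modeline indicator is inferred from the paragraph above rather than stated in it.

```python
# EOL type -> suffix of the subsidiary coding system name.
EOL_NAME_SUFFIX = {"lf": "-unix", "crlf": "-dos", "cr": "-mac"}
# EOL type -> suffix of the modeline indicator (assumed empty for "lf").
EOL_MODELINE_SUFFIX = {"lf": "", "crlf": ":T", "cr": ":t"}

def subsidiary(base_name: str, base_indicator: str, eol_type: str):
    """Return (name, modeline indicator) of a subsidiary coding system."""
    return (base_name + EOL_NAME_SUFFIX[eol_type],
            base_indicator + EOL_MODELINE_SUFFIX[eol_type])

print(subsidiary("iso-2022-8", "ISO8", "cr"))
```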
2938 In the following table, each coding system is given with its mode
2939 line indicator in parentheses. Non-textual coding systems are listed
2940 first, followed by textual coding systems and their aliases. (The
2941 coding system subsidiary modeline indicators ":T" and ":t" will be
2942 omitted from the table of coding systems.)
2944 ### SJT 1999-08-23 Maybe should order these by language? Definitely
2945 need language usage for the ISO-8859 family.
2947 Note that although true coding system aliases have been implemented
2948 for XEmacs 21.2, the coding system initialization has not yet been
2949 converted as of 21.2.19. So coding systems described as aliases have
2950 the same properties as the aliased coding system, but will not be equal
2953 `automatic-conversion'
2958 Modeline indicator: `Auto'. A type `undecided' coding system.
2959 Attempts to determine an appropriate coding system from file
2960 contents or the environment.
2969 `no-conversion-unix'
2970 Modeline indicator: `Raw'. A type `no-conversion' coding system,
2971 which converts only line-break-codes. An implementation quirk
2972 means that this coding system is also used for ISO8859-1.
2975 Modeline indicator: `Binary'. A type `no-conversion' coding
2976 system which does no character coding or EOL conversions. An
2977 alias for `raw-text-unix'.
2982 `alternativnyj-unix'
2983 Modeline indicator: `Cy.Alt'. A type `ccl' coding system used for
2984 Alternativnyj, an encoding of the Cyrillic alphabet.
2990 Modeline indicator: `Zh/Big5'. A type `big5' coding system used
2991 for BIG5, the most common encoding of traditional Chinese as used
2998 Modeline indicator: `Zh-GB/EUC'. A type `iso2022' coding system
2999 used for simplified Chinese (as used in the People's Republic of
3000 China), with the `ascii' (G0), `chinese-gb2312' (G1), and `sisheng'
3001 (G2) character sets initially designated. Chinese EUC (Extended
3008 Modeline indicator: `CText/Hbrw'. A type `iso2022' coding system
3009 with the `ascii' (G0) and `hebrew-iso8859-8' (G1) character sets
3010 initially designated for Hebrew.
3016 Modeline indicator: `CText'. A type `iso2022' 8-bit coding system
3017 with the `ascii' (G0) and `latin-iso8859-1' (G1) character sets
3018 initially designated. X11 Compound Text Encoding. Often
3019 mistakenly recognized instead of EUC encodings; the usual cause is
3020 an inappropriate setting of `coding-priority-list'.
3023 Modeline indicator: `ESC/Quot'. A type `iso2022' 8-bit coding
3024 system with the `ascii' (G0) and `latin-iso8859-1' (G1) character
3025 sets initially designated and escape quoting. Unix EOL conversion
3026 (i.e., no conversion). It is used for .ELC files.
3032 Modeline indicator: `Ja/EUC'. A type `iso2022' 8-bit coding system
3033 with `ascii' (G0), `japanese-jisx0208' (G1), `katakana-jisx0201'
3034 (G2), and `japanese-jisx0212' (G3) initially designated. Japanese
3035 EUC (Extended Unix Code).
3041 Modeline indicator: `ko/EUC'. A type `iso2022' 8-bit coding system
3042 with `ascii' (G0) and `korean-ksc5601' (G1) initially designated.
3043 Korean EUC (Extended Unix Code).
3046 Modeline indicator: `Zh-GB/Hz'. A type `no-conversion' coding
3047 system with Unix EOL convention (i.e., no conversion) using
3048 post-read-decode and pre-write-encode functions to translate the
3049 Hz/ZW coding system used for Chinese.
3052 `iso-2022-7bit-unix'
3056 Modeline indicator: `ISO7'. A type `iso2022' 7-bit coding system
3057 with `ascii' (G0) initially designated. Other character sets must
3058 be explicitly designated to be used.
3061 `iso-2022-7bit-ss2-dos'
3062 `iso-2022-7bit-ss2-mac'
3063 `iso-2022-7bit-ss2-unix'
3064 Modeline indicator: `ISO7/SS'. A type `iso2022' 7-bit coding
3065 system with `ascii' (G0) initially designated. Other character
3066 sets must be explicitly designated to be used. SS2 is used to
3067 invoke a 96-charset, one character at a time.
3073 Modeline indicator: `ISO8'. A type `iso2022' 8-bit coding system
3074 with `ascii' (G0) and `latin-iso8859-1' (G1) initially designated.
3075 Other character sets must be explicitly designated to be used.
3076 No single-shift or locking-shift.
3079 `iso-2022-8bit-ss2-dos'
3080 `iso-2022-8bit-ss2-mac'
3081 `iso-2022-8bit-ss2-unix'
3082 Modeline indicator: `ISO8/SS'. A type `iso2022' 8-bit coding
3083 system with `ascii' (G0) and `latin-iso8859-1' (G1) initially
3084 designated. Other character sets must be explicitly designated to
3085 be used. SS2 is used to invoke a 96-charset, one character at a
3089 `iso-2022-int-1-dos'
3090 `iso-2022-int-1-mac'
3091 `iso-2022-int-1-unix'
3092 Modeline indicator: `INT-1'. A type `iso2022' 7-bit coding system
3093 with `ascii' (G0) and `korean-ksc5601' (G1) initially designated.
3096 `iso-2022-jp-1978-irv'
3097 `iso-2022-jp-1978-irv-dos'
3098 `iso-2022-jp-1978-irv-mac'
3099 `iso-2022-jp-1978-irv-unix'
3100 Modeline indicator: `Ja-78/7bit'. A type `iso2022' 7-bit coding
3101 system. For compatibility with old Japanese terminals; if you
3102 need to know, look at the source.
3105 `iso-2022-jp-2 (ISO7/SS)'
3111 `iso-2022-jp-2-unix'
3112 Modeline indicator: `MULE/7bit'. A type `iso2022' 7-bit coding
3113 system with `ascii' (G0) initially designated, and complex
3114 specifications to ensure backward compatibility with old Japanese
3115 systems. Used for communication with mail and news in Japan. The
3116 "-2" versions also use SS2 to invoke a 96-charset one character at
3120 Modeline indicator: `Ko/7bit'. A type `iso2022' 7-bit coding
3121 system with `ascii' (G0) and `korean-ksc5601' (G1) initially
3122 designated. Used for e-mail in Korea.
3127 `iso-2022-lock-unix'
3128 Modeline indicator: `ISO7/Lock'. A type `iso2022' 7-bit coding
3129 system with `ascii' (G0) initially designated, using Locking-Shift
3130 to invoke a 96-charset.
3136 Due to implementation, this is not a type `iso2022' coding system,
3137 but rather an alias for the `raw-text' coding system.
3143 Modeline indicator: `MIME/Ltn-2'. A type `iso2022' coding system
3144 with `ascii' (G0) and `latin-iso8859-2' (G1) initially invoked.
3150 Modeline indicator: `MIME/Ltn-3'. A type `iso2022' coding system
3151 with `ascii' (G0) and `latin-iso8859-3' (G1) initially invoked.
3157 Modeline indicator: `MIME/Ltn-4'. A type `iso2022' coding system
3158 with `ascii' (G0) and `latin-iso8859-4' (G1) initially invoked.
3164 Modeline indicator: `ISO8/Cyr'. A type `iso2022' coding system
3165 with `ascii' (G0) and `cyrillic-iso8859-5' (G1) initially invoked.
3171 Modeline indicator: `Grk'. A type `iso2022' coding system with
3172 `ascii' (G0) and `greek-iso8859-7' (G1) initially invoked.
3178 Modeline indicator: `MIME/Hbrw'. A type `iso2022' coding system
3179 with `ascii' (G0) and `hebrew-iso8859-8' (G1) initially invoked.
3185 Modeline indicator: `MIME/Ltn-5'. A type `iso2022' coding system
3186 with `ascii' (G0) and `latin-iso8859-9' (G1) initially invoked.
3192 Modeline indicator: `KOI8'. A type `ccl' coding-system used for
3193 KOI8-R, an encoding of the Cyrillic alphabet.
3199 Modeline indicator: `Ja/SJIS'. A type `shift-jis' coding-system
3200 implementing the Shift-JIS encoding for Japanese. The underscore
3201 is to conform to the MIME charset implementing this encoding.
3207 Modeline indicator: `TIS620'. A type `ccl' encoding for Thai. The
3208 external encoding is defined by TIS620, the internal encoding is
3209 peculiar to MULE, and called `thai-xtis'.
3212 Modeline indicator: `VIQR'. A type `no-conversion' coding system
3213 with Unix EOL convention (i.e., no conversion) using
3214 post-read-decode and pre-write-encode functions to translate the
3215 VIQR coding system for Vietnamese.
3221 Modeline indicator: `VISCII'. A type `ccl' coding-system used for
3222 VISCII 1.1 for Vietnamese. Differs slightly from VSCII; VISCII is
3223 given priority by XEmacs.
3229 Modeline indicator: `VSCII'. A type `ccl' coding-system used for
3230 VSCII 1.1 for Vietnamese. Differs slightly from VISCII, which is
3231 given priority by XEmacs. Use `(prefer-coding-system
3232 'vietnamese-vscii)' to give priority to VSCII.
3236 File: lispref.info, Node: CCL, Next: Category Tables, Prev: Coding Systems, Up: MULE
3241 CCL (Code Conversion Language) is a simple structured programming
3242 language designed for character coding conversions. A CCL program is
3243 compiled to CCL code (represented by a vector of integers) and executed
3244 by the CCL interpreter embedded in Emacs. The CCL interpreter
3245 implements a virtual machine with 8 registers called `r0', ..., `r7', a
3246 number of control structures, and some I/O operators. Take care when
3247 using registers `r0' (used in implicit "set" statements) and especially
3248 `r7' (used internally by several statements and operations, especially
3249 for multiple return values and I/O operations).
3251 CCL is used for code conversion during process I/O and file I/O for
3252 non-ISO2022 coding systems. (It is the only way for a user to specify a
3253 code conversion function.) It is also used for calculating the code
3254 point of an X11 font from a character code. However, since CCL is
3255 designed as a powerful programming language, it can be used for more
3256 generic calculation where efficiency is demanded. A combination of
3257 three or more arithmetic operations can be calculated faster by CCL than
3260 *Warning:* The code in `src/mule-ccl.c' and
3261 `$packages/lisp/mule-base/mule-ccl.el' is the definitive description of
3262 CCL's semantics. The previous version of this section contained
3263 several typos and obsolete names left from earlier versions of MULE,
3264 and many may remain. (I am not an experienced CCL programmer; the few
3265 who know CCL well find writing English painful.)
3267 A CCL program transforms an input data stream into an output data
3268 stream. The input stream, held in a buffer of constant bytes, is left
3269 unchanged. The buffer may be filled by an external input operation,
3270 taken from an Emacs buffer, or taken from a Lisp string. The output
3271 buffer is a dynamic array of bytes, which can be written by an external
3272 output operation, inserted into an Emacs buffer, or returned as a Lisp
3275 A CCL program is a (Lisp) list containing two or three members. The
3276 first member is the "buffer magnification", which indicates the
3277 required minimum size of the output buffer as a multiple of the input
3278 buffer. It is followed by the "main block" which executes while there
3279 is input remaining, and an optional "EOF block" which is executed when
3280 the input is exhausted. Both the main block and the EOF block are CCL
3283 A "CCL block" is either a CCL statement or list of CCL statements.
3284 A "CCL statement" is either a "set statement" (either an integer or an
3285 "assignment", which is a list of a register to receive the assignment,
3286 an assignment operator, and an expression) or a "control statement" (a
3287 list starting with a keyword, whose allowable syntax depends on the
3292 * CCL Syntax:: CCL program syntax in BNF notation.
3293 * CCL Statements:: Semantics of CCL statements.
3294 * CCL Expressions:: Operators and expressions in CCL.
3295 * Calling CCL:: Running CCL programs.
3296 * CCL Examples:: The encoding functions for Big5 and KOI-8.
3299 File: lispref.info, Node: CCL Syntax, Next: CCL Statements, Up: CCL
3304 The full syntax of a CCL program in BNF notation:
3307 (BUFFER_MAGNIFICATION
3311 BUFFER_MAGNIFICATION := integer
3312 CCL_MAIN_BLOCK := CCL_BLOCK
3313 CCL_EOF_BLOCK := CCL_BLOCK
3316 STATEMENT | (STATEMENT [STATEMENT ...])
3318 SET | IF | BRANCH | LOOP | REPEAT | BREAK | READ | WRITE
3323 | (REG ASSIGNMENT_OPERATOR EXPRESSION)
3326 EXPRESSION := ARG | (EXPRESSION OPERATOR ARG)
3328 IF := (if EXPRESSION CCL_BLOCK [CCL_BLOCK])
3329 BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
3330 LOOP := (loop STATEMENT [STATEMENT ...])
3334 | (write-repeat [REG | integer | string])
3335 | (write-read-repeat REG [integer | ARRAY])
3338 | (read-if (REG OPERATOR ARG) CCL_BLOCK CCL_BLOCK)
3339 | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
3342 | (write EXPRESSION)
3343 | (write integer) | (write string) | (write REG ARRAY)
3345 CALL := (call ccl-program-name)
3348 REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
3349 ARG := REG | integer
3351 + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
3352 | < | > | == | <= | >= | != | de-sjis | en-sjis
3353 ASSIGNMENT_OPERATOR :=
3354 += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
3355 ARRAY := '[' integer ... ']'
3358 File: lispref.info, Node: CCL Statements, Next: CCL Expressions, Prev: CCL Syntax, Up: CCL
3363 The Emacs Code Conversion Language provides the following statement
3364 types: "set", "if", "branch", "loop", "repeat", "break", "read",
3365 "write", "call", and "end".
3370 The "set" statement has three variants with the syntaxes `(REG =
3371 EXPRESSION)', `(REG ASSIGNMENT_OPERATOR EXPRESSION)', and `INTEGER'.
3372 The assignment operator variation of the "set" statement works the same
3373 way as the corresponding C expression statement does. The assignment
3374 operators are `+=', `-=', `*=', `/=', `%=', `&=', `|=', `^=', `<<=',
3375 and `>>=', and they have the same meanings as in C. A "naked integer"
3376 INTEGER is equivalent to a SET statement of the form `(r0 = INTEGER)'.
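
As an illustration, the three variants of the "set" statement might
appear in a CCL block as follows. This is only a sketch: the variable
name `my-ccl-set-examples' is hypothetical, and the value is plain
quoted Lisp data that is neither compiled nor executed here.

```elisp
;; Hypothetical name.  A CCL block is ordinary Lisp list structure,
;; so it can be quoted and stored like any other list.
(defvar my-ccl-set-examples
  '((r1 = 65)          ; plain assignment: r1 = 65
    (r1 += 1)          ; assignment operator, as in C: r1 = r1 + 1
    (r2 = (r1 << 8))   ; the right-hand side may be an infix expression
    0)                 ; naked integer, equivalent to (r0 = 0)
  "Illustrative CCL block showing the three forms of the set statement.")
```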
3381 The "read" statement takes one or more registers as arguments. It
3382 reads one byte (a C char) from the input into each register in turn.
3384 The "write" statement takes several forms. In the form `(write REG ...)' it
3385 takes one or more registers as arguments and writes each in turn to the
3386 output. The integer in a register (interpreted as an Emchar) is
3387 encoded to multibyte form (i.e., Bufbytes) and written to the current
3388 output buffer. If it is less than 256, it is written as is. The forms
3389 `(write EXPRESSION)' and `(write INTEGER)' are treated analogously.
3390 The form `(write STRING)' writes the constant string to the output. A
3391 "naked string" `STRING' is equivalent to the statement `(write
3392 STRING)'. The form `(write REG ARRAY)' writes the REGth element of the
3393 ARRAY to the output.
3395 Conditional statements:
3396 =======================
3398 The "if" statement takes an EXPRESSION, a CCL BLOCK, and an optional
3399 SECOND CCL BLOCK as arguments. If the EXPRESSION evaluates to
3400 non-zero, the first CCL BLOCK is executed. Otherwise, if there is a
3401 SECOND CCL BLOCK, it is executed.
3403 The "read-if" variant of the "if" statement takes an EXPRESSION, a
3404 CCL BLOCK, and an optional SECOND CCL BLOCK as arguments. The
3405 EXPRESSION must have the form `(REG OPERATOR OPERAND)' (where OPERAND is
3406 a register or an integer). The `read-if' statement first reads from
3407 the input into the first register operand in the EXPRESSION, then
3408 conditionally executes a CCL block just as the `if' statement does.
3410 The "branch" statement takes an EXPRESSION and one or more CCL
3411 blocks as arguments. The CCL blocks are treated as a zero-indexed
3412 array, and the `branch' statement uses the EXPRESSION as the index of
3413 the CCL block to execute. Null CCL blocks may be used as no-ops,
3414 continuing execution with the statement following the `branch'
3415 statement in the containing CCL block. Out-of-range values for the
3416 EXPRESSION are also treated as no-ops.
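
The "if" and "branch" statements described above could be written as
in the following sketch. The variable name
`my-ccl-conditional-examples' is hypothetical, and the list is quoted
data only; it is shown to illustrate the syntax, not as a tested
program.

```elisp
;; Hypothetical name; the value is quoted CCL source, not compiled code.
(defvar my-ccl-conditional-examples
  '((if (r0 < 128)
        (write r0)        ; first block: run when (r0 < 128) is non-zero
      (write 63))         ; optional second block: write the code for `?'
    (branch r1
      (write "zero")      ; block 0, chosen when r1 is 0
      (write "one")       ; block 1, chosen when r1 is 1
      (write "two")))     ; block 2; any other r1 value is a no-op
  "Illustrative CCL block using the if and branch statements.")
```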
3418 The "read-branch" variant of the "branch" statement takes a
3419 REGISTER and one or more CCL blocks as arguments.
3420 The `read-branch' statement first reads from the input into the
3421 REGISTER, then conditionally executes a CCL block just as the `branch'
3424 Loop control statements:
3425 ========================
3427 The "loop" statement creates a block with an implied jump from the end
3428 of the block back to its head. The loop is exited on a `break'
3429 statement, and continued without executing the tail by a `repeat'
3432 The "break" statement, written `(break)', terminates the current
3433 loop and continues with the next statement in the current block.
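
A loop that copies input bytes to the output until it sees a zero byte
might be sketched as follows. The name `my-ccl-copy-until-nul' is
hypothetical, and the program is shown as uncompiled quoted data. The
sketch ends the loop body with an explicit `(repeat)' so that its
behavior does not depend on how the end of a loop body is treated.

```elisp
;; Hypothetical name; a complete CCL program as quoted Lisp data.
(defvar my-ccl-copy-until-nul
  '(1                       ; BUFFER_MAGNIFICATION
    ((loop
      (read r0)             ; read one byte into r0
      (if (r0 == 0)
          (break))          ; leave the loop at a zero byte
      (write r0)            ; otherwise copy the byte to the output
      (repeat))))           ; continue the loop from its head
  "Illustrative CCL program built from loop, break, and repeat.")
```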
3435 The "repeat" statement has three variants, `repeat', `write-repeat',
3436 and `write-read-repeat'. Each continues the current loop from its
3437 head, possibly after performing I/O. `repeat' takes no arguments and
3438 does no I/O before jumping. `write-repeat' takes a single argument (a
3439 register, an integer, or a string), writes it to the output, then jumps.
3440 `write-read-repeat' takes one or two arguments. The first must be a
3441 register. The second may be an integer or an array; if absent, it is
3442 implicitly set to the first (register) argument. `write-read-repeat'
3443 writes its second argument to the output, then reads from the input
3444 into the register, and finally jumps. See the `write' and `read'
3445 statements for the semantics of the I/O operations for each type of
3448 Other control statements:
3449 =========================
3451 The "call" statement, written `(call CCL-PROGRAM-NAME)', executes a CCL
3452 program as a subroutine. It does not return a value to the caller, but
3453 can modify the register status.
3455 The "end" statement, written `(end)', terminates the CCL program
3456 successfully, and returns to the caller (which may be a CCL program). It
3457 does not alter the status of the registers.
3460 File: lispref.info, Node: CCL Expressions, Next: Calling CCL, Prev: CCL Statements, Up: CCL
3465 CCL, unlike Lisp, uses infix expressions. The simplest CCL expressions
3466 consist of a single OPERAND, either a register (one of `r0', ..., `r7')
3467 or an integer. Complex expressions are lists of the form `( EXPRESSION
3468 OPERATOR OPERAND )'. Unlike C, assignments are not expressions.
3470 In the following table, X is the target register for a "set". In
3471 subexpressions, this is implicitly `r7'. This means that `>8', `//',
3472 `de-sjis', and `en-sjis' cannot be used freely in subexpressions, since
3473 they return parts of their values in `r7'. Y may be an expression,
3474 register, or integer, while Z must be a register or an integer.
3476 Name Operator Code C-like Description
3477 CCL_PLUS `+' 0x00 X = Y + Z
3478 CCL_MINUS `-' 0x01 X = Y - Z
3479 CCL_MUL `*' 0x02 X = Y * Z
3480 CCL_DIV `/' 0x03 X = Y / Z
3481 CCL_MOD `%' 0x04 X = Y % Z
3482 CCL_AND `&' 0x05 X = Y & Z
3483 CCL_OR `|' 0x06 X = Y | Z
3484 CCL_XOR `^' 0x07 X = Y ^ Z
3485 CCL_LSH `<<' 0x08 X = Y << Z
3486 CCL_RSH `>>' 0x09 X = Y >> Z
3487 CCL_LSH8 `<8' 0x0A X = (Y << 8) | Z
3488 CCL_RSH8 `>8' 0x0B X = Y >> 8, r[7] = Y & 0xFF
3489 CCL_DIVMOD `//' 0x0C X = Y / Z, r[7] = Y % Z
3490 CCL_LS `<' 0x10 X = (Y < Z)
3491 CCL_GT `>' 0x11 X = (Y > Z)
3492 CCL_EQ `==' 0x12 X = (Y == Z)
3493 CCL_LE `<=' 0x13 X = (Y <= Z)
3494 CCL_GE `>=' 0x14 X = (Y >= Z)
3495 CCL_NE `!=' 0x15 X = (Y != Z)
3496 CCL_ENCODE_SJIS `en-sjis' 0x16 X = HIGHER_BYTE (SJIS (Y, Z))
3497 r[7] = LOWER_BYTE (SJIS (Y, Z))
3498 CCL_DECODE_SJIS `de-sjis' 0x17 X = HIGHER_BYTE (DE-SJIS (Y, Z))
3499 r[7] = LOWER_BYTE (DE-SJIS (Y, Z))
3501 The CCL operators are as in C, with the addition of CCL_LSH8,
3502 CCL_RSH8, CCL_DIVMOD, CCL_ENCODE_SJIS, and CCL_DECODE_SJIS. The
3503 CCL_ENCODE_SJIS and CCL_DECODE_SJIS operators treat their first and second operands
3504 as the high and low bytes of a two-byte character code. (SJIS stands
3505 for Shift JIS, an encoding of Japanese characters used by Microsoft.
3506 CCL_ENCODE_SJIS is a complicated transformation of the Japanese
3507 standard JIS encoding to Shift JIS. CCL_DECODE_SJIS is its inverse.)
3508 It is somewhat odd to represent the SJIS operations in infix form.
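
The arithmetic performed by the three CCL-specific operators `<8',
`>8', and `//' can be modelled in ordinary Lisp, which may make the
table entries easier to read. The function names below are
hypothetical helpers, not part of CCL or XEmacs; each returns what the
table says is stored in X, paired with the value left in `r7' where
there is one. The model assumes non-negative integer operands.

```elisp
;; Hypothetical helpers that restate three table rows in Lisp.
(defun my-ccl-model-lsh8 (y z)
  "Return the value CCL `<8' stores in X, that is (Y << 8) | Z."
  (logior (ash y 8) z))

(defun my-ccl-model-rsh8 (y)
  "Return a cons (X . R7) modelling CCL `>8' applied to Y."
  (cons (ash y -8) (logand y 255)))

(defun my-ccl-model-divmod (y z)
  "Return a cons (X . R7) modelling CCL `//': quotient and remainder."
  (cons (/ y z) (% y z)))

(my-ccl-model-lsh8 18 52)      ; => 4660, that is #x1234
(my-ccl-model-rsh8 4660)       ; => (18 . 52)
(my-ccl-model-divmod 17 5)     ; => (3 . 2)
```

Returning a cons makes it visible that `>8' and `//' produce two
results, which is why the text above warns against using them freely
in subexpressions.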
3511 File: lispref.info, Node: Calling CCL, Next: CCL Examples, Prev: CCL Expressions, Up: CCL
3516 CCL programs are called automatically during Emacs buffer I/O when the
3517 external representation has a coding system type of `shift-jis',
3518 `big5', or `ccl'. The program is specified by the coding system (*note
3519 Coding Systems::). You can also call CCL programs from other CCL
3520 programs, and from Lisp using these functions:
3522 - Function: ccl-execute ccl-program status
3523 Execute CCL-PROGRAM with registers initialized by STATUS.
3524 CCL-PROGRAM is a vector of compiled CCL code created by
3525 `ccl-compile'. It is an error for the program to try to execute a
3526 CCL I/O command. STATUS must be a vector of nine values,
3527 specifying the initial value for the R0, R1 .. R7 registers and
3528 for the instruction counter IC. A `nil' value for a register
3529 initializer causes the register to be set to 0. A `nil' value for
3530 the IC initializer causes execution to start at the beginning of
3531 the program. When the program is done, STATUS is modified (by
3532 side-effect) to contain the ending values for the corresponding
3535 - Function: ccl-execute-on-string ccl-program status string &optional
3537 Execute CCL-PROGRAM with initial STATUS on STRING. CCL-PROGRAM is
3538 a vector of compiled CCL code created by `ccl-compile'. STATUS
3539 must be a vector of nine values, specifying the initial value for
3540 the R0, R1 .. R7 registers and for the instruction counter IC. A
3541 `nil' value for a register initializer causes the register to be
3542 set to 0. A `nil' value for the IC initializer causes execution
3543 to start at the beginning of the program. An optional fourth
3544 argument CONTINUE, if non-`nil', causes the IC to remain on the
3545 unsatisfied read operation if the program terminates due to
3546 exhaustion of the input buffer. Otherwise the IC is set to the end
3547 of the program. When the program is done, STATUS is modified (by
3548 side-effect) to contain the ending values for the corresponding
3549 registers and IC. Returns the resulting string.
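
A call to `ccl-execute-on-string' might look like the following
sketch. It assumes that `ccl-compile' and `ccl-execute-on-string' are
available in the running Emacs (both need MULE support); the function
name `my-ccl-echo-string' is hypothetical. The CCL program reads each
input byte and writes it back out.

```elisp
;; Assumption: `ccl-compile' and `ccl-execute-on-string' are available,
;; which requires an Emacs built with MULE support.
(defun my-ccl-echo-string (string)
  "Run a byte-copying CCL program on STRING and return the result."
  (let ((program (ccl-compile
                  '(1                 ; BUFFER_MAGNIFICATION
                    ((loop
                      (read r0)       ; read one byte into r0
                      (write r0)      ; write it back out
                      (repeat))))))   ; continue from the head of the loop
        ;; Nine slots: r0..r7 and IC.  nil means 0 for a register and
        ;; "start at the beginning of the program" for IC.
        (status (make-vector 9 nil)))
    (ccl-execute-on-string program status string)))
```

After the call, the STATUS vector holds the final register values and
IC, so a caller that needs them should keep its own reference to the
vector instead of creating it inside the function as this sketch does.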
3551 To call a CCL program from another CCL program, it must first be
3554 - Function: register-ccl-program name ccl-program
3555 Register NAME for CCL program CCL-PROGRAM in `ccl-program-table'.
3556 CCL-PROGRAM should be the compiled form of a CCL program, or
3557 `nil'. Return index number of the registered CCL program.
3559 Information about the processor time used by the CCL interpreter can
3560 be obtained using these functions:
3562 - Function: ccl-elapsed-time
3563 Returns the elapsed processor time of the CCL interpreter as a cons
3564 of user and system time, as floating point numbers measured in
3565 seconds. If only one overall value can be determined, the return
3566 value will be a cons of that value and 0.
3568 - Function: ccl-reset-elapsed-time
3569 Resets the CCL interpreter's internal elapsed time registers.
3572 File: lispref.info, Node: CCL Examples, Prev: Calling CCL, Up: CCL
3577 This section is not yet written.
3580 File: lispref.info, Node: Category Tables, Prev: CCL, Up: MULE
3585 A category table is a type of char table used for keeping track of
3586 categories. Categories are used for classifying characters for use in
3587 regexps--you can refer to a category rather than having to use a
3588 complicated [] expression (and category lookups are significantly
3591 There are 95 different categories available, one for each printable
3592 character (including space) in the ASCII charset. Each category is
3593 designated by one such character, called a "category designator". They
3594 are specified in a regexp using the syntax `\cX', where X is a category
3595 designator. (This is not yet implemented.)
3597 A category table specifies, for each character, the categories that
3598 the character is in. Note that a character can be in more than one
3599 category. More specifically, a category table maps from a character to
3600 either the value `nil' (meaning the character is in no categories) or a
3601 95-element bit vector, specifying for each of the 95 categories whether
3602 the character is in that category.
3604 Special Lisp functions are provided that abstract this, so you do not
3605 have to directly manipulate bit vectors.
3607 - Function: category-table-p object
3608 This function returns `t' if OBJECT is a category table.
3610 - Function: category-table &optional buffer
3611 This function returns the current category table. This is the one
3612 specified by the current buffer, or by BUFFER if it is non-`nil'.
3614 - Function: standard-category-table
3615 This function returns the standard category table. This is the
3616 one used for new buffers.
3618 - Function: copy-category-table &optional category-table
3619 This function returns a new category table which is a copy of
3620 CATEGORY-TABLE, which defaults to the standard category table.
3622 - Function: set-category-table category-table &optional buffer
3623 This function selects CATEGORY-TABLE as the new category table for
3624 BUFFER. BUFFER defaults to the current buffer if omitted.
3626 - Function: category-designator-p object
3627 This function returns `t' if OBJECT is a category designator (a
3628 char in the range `' '' to `'~'').
3630 - Function: category-table-value-p object
3631 This function returns `t' if OBJECT is a category table value.
3632 Valid values are `nil' or a bit vector of size 95.
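
The functions above might be combined as in the following sketch,
which gives a scratch buffer its own copy of the standard category
table. The function name `my-use-private-category-table' is
hypothetical, and the sketch assumes that `with-temp-buffer' is
available.

```elisp
;; Hypothetical helper built only from the functions described above.
(defun my-use-private-category-table ()
  "Install a copy of the standard category table in a scratch buffer.
Return non-nil if the buffer then reports that copy as its table."
  (with-temp-buffer
    ;; Copy first, so the standard table itself is never modified.
    (let ((table (copy-category-table (standard-category-table))))
      (set-category-table table)      ; applies to the current buffer
      (and (category-table-p table)
           (eq table (category-table))))))
```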
3635 File: lispref.info, Node: Tips, Next: Building XEmacs and Object Allocation, Prev: MULE, Up: Top
3640 This chapter describes no additional features of XEmacs Lisp. Instead
3641 it gives advice on making effective use of the features described in
3642 the previous chapters.
3646 * Style Tips:: Writing clean and robust programs.
3647 * Compilation Tips:: Making compiled code run fast.
3648 * Documentation Tips:: Writing readable documentation strings.
3649 * Comment Tips:: Conventions for writing comments.
3650 * Library Headers:: Standard headers for library packages.
3653 File: lispref.info, Node: Style Tips, Next: Compilation Tips, Up: Tips
3655 Writing Clean Lisp Programs
3656 ===========================
3658 Here are some tips for avoiding common errors in writing Lisp code
3659 intended for widespread use:
3661 * Since all global variables share the same name space, and all
3662 functions share another name space, you should choose a short word
3663 to distinguish your program from other Lisp programs. Then take
3664 care to begin the names of all global variables, constants, and
3665 functions with the chosen prefix. This helps avoid name conflicts.
3667 This recommendation applies even to names for traditional Lisp
3668 primitives that are not primitives in XEmacs Lisp--even to `cadr'.
3669 Believe it or not, there is more than one plausible way to define
3670 `cadr'. Play it safe; prepend your name prefix to produce a name
3671 like `foo-cadr' or `mylib-cadr' instead.
3673 If you write a function that you think ought to be added to Emacs
3674 under a certain name, such as `twiddle-files', don't call it by
3675 that name in your program. Call it `mylib-twiddle-files' in your
3676 program, and send mail to `bug-gnu-emacs@prep.ai.mit.edu'
3677 suggesting we add it to Emacs. If and when we do, we can change
3678 the name easily enough.
3680 If one prefix is insufficient, your package may use two or three
3681 alternative common prefixes, so long as they make sense.
3683 Separate the prefix from the rest of the symbol name with a hyphen,
3684 `-'. This will be consistent with XEmacs itself and with most
3685 Emacs Lisp programs.
3687 * It is often useful to put a call to `provide' in each separate
3688 library program, at least if there is more than one entry point to
3691 * If a file requires certain other library programs to be loaded
3692 beforehand, then the comments at the beginning of the file should
3693 say so. Also, use `require' to make sure they are loaded.
3695 * If one file FOO uses a macro defined in another file BAR, FOO
3696 should contain this expression before the first use of the macro:
3698 (eval-when-compile (require 'BAR))
3700 (And BAR should contain `(provide 'BAR)', to make the `require'
3701 work.) This will cause BAR to be loaded when you byte-compile
3702 FOO. Otherwise, you risk compiling FOO without the necessary
3703 macro loaded, and that would produce compiled code that won't work
3704 right. *Note Compiling Macros::.
3706 Using `eval-when-compile' avoids loading BAR when the compiled
3707 version of FOO is _used_.
3709 * If you define a major mode, make sure to run a hook variable using
3710 `run-hooks', just as the existing major modes do. *Note Hooks::.
3712 * If the purpose of a function is to tell you whether a certain
3713 condition is true or false, give the function a name that ends in
3714 `p'. If the name is one word, add just `p'; if the name is
3715 multiple words, add `-p'. Examples are `framep' and
3718 * If a user option variable records a true-or-false condition, give
3719 it a name that ends in `-flag'.
3721 * Please do not define `C-c LETTER' as a key in your major modes.
3722 These sequences are reserved for users; they are the *only*
3723 sequences reserved for users, so we cannot do without them.
3725 Instead, define sequences consisting of `C-c' followed by a
3726 non-letter. These sequences are reserved for major modes.
3728 Changing all the major modes in Emacs 18 so they would follow this
3729 convention was a lot of work. Abandoning this convention would
3730 make that work go to waste, and inconvenience users.
3732 * Sequences consisting of `C-c' followed by `{', `}', `<', `>', `:'
3733 or `;' are also reserved for major modes.
3735 * Sequences consisting of `C-c' followed by any other punctuation
3736 character are allocated for minor modes. Using them in a major
3737 mode is not absolutely prohibited, but if you do that, the major
3738 mode binding may be shadowed from time to time by minor modes.
3740 * You should not bind `C-h' following any prefix character (including
3741 `C-c'). If you don't bind `C-h', it is automatically available as
3742 a help character for listing the subcommands of the prefix
3745 * You should not bind a key sequence ending in <ESC> except following
3746 another <ESC>. (That is, it is ok to bind a sequence ending in
3749 The reason for this rule is that a non-prefix binding for <ESC> in
3750 any context prevents recognition of escape sequences as function
3751 keys in that context.
3753 * Applications should not bind mouse events based on button 1 with
3754 the shift key held down. These events include `S-mouse-1',
3755 `M-S-mouse-1', `C-S-mouse-1', and so on. They are reserved for
3758 * Modes should redefine `mouse-2' as a command to follow some sort of
3759 reference in the text of a buffer, if users usually would not want
3760 to alter the text in that buffer by hand. Modes such as Dired,
3761 Info, Compilation, and Occur redefine it in this way.
3763 * When a package provides a modification of ordinary Emacs behavior,
3764 it is good to include a command to enable and disable the feature.
3765 Provide a command named `WHATEVER-mode' which turns the feature on
3766 or off, and make it autoload (*note Autoload::). Design the
3767 package so that simply loading it has no visible effect--that
3768 should not enable the feature. Users will request the feature by
3769 invoking the command.
3771 * It is a bad idea to define aliases for the Emacs primitives. Use
3772 the standard names instead.
3774 * Redefining an Emacs primitive is an even worse idea. It may do
3775 the right thing for a particular program, but there is no telling
3776 what other programs might break as a result.
3778 * If a file does replace any of the functions or library programs of
3779 standard XEmacs, prominent comments at the beginning of the file
3780 should say which functions are replaced, and how the behavior of
3781 the replacements differs from that of the originals.
3783 * Please keep the names of your XEmacs Lisp source files to 13
3784 characters or fewer. This way, if the files are compiled, the
3785 compiled files' names will be 14 characters or fewer, which is
3786 short enough to fit on all kinds of Unix systems.
3788 * Don't use `next-line' or `previous-line' in programs; nearly
3789 always, `forward-line' is more convenient as well as more
3790 predictable and robust. *Note Text Lines::.
3792 * Don't call functions that set the mark, unless setting the mark is
3793 one of the intended features of your program. The mark is a
3794 user-level feature, so it is incorrect to change the mark except
3795 to supply a value for the user's benefit. *Note The Mark::.
3797 In particular, don't use these functions:
3799 * `beginning-of-buffer', `end-of-buffer'
3801 * `replace-string', `replace-regexp'
3803 If you just want to move point, or replace a certain string,
3804 without any of the other features intended for interactive users,
3805 you can replace these functions with one or two lines of simple
3808 * Use lists rather than vectors, except when there is a particular
3809 reason to use a vector. Lisp has more facilities for manipulating
3810 lists than for vectors, and working with lists is usually more
3813 Vectors are advantageous for tables that are substantial in size
3814 and are accessed in random order (not searched front to back),
3815 provided there is no need to insert or delete elements (only lists
3818 * The recommended way to print a message in the echo area is with
3819 the `message' function, not `princ'. *Note The Echo Area::.
3821 * When you encounter an error condition, call the function `error'
3822 (or `signal'). The function `error' does not return. *Note
3825 Do not use `message', `throw', `sleep-for', or `beep' to report
3828 * An error message should start with a capital letter but should not
3831 * Try to avoid using recursive edits. Instead, do what the Rmail `e'
3832 command does: use a new local keymap that contains one command
3833 defined to switch back to the old local keymap. Or do what the
3834 `edit-options' command does: switch to another buffer and let the
3835 user switch back at will. *Note Recursive Editing::.
3837 * In some other systems there is a convention of choosing variable
3838 names that begin and end with `*'. We don't use that convention
3839 in Emacs Lisp, so please don't use it in your programs. (Emacs
3840 uses such names only for program-generated buffers.) The users
3841 will find Emacs more coherent if all libraries use the same
3844 * Use names starting with a space for temporary buffers (*note
3845 Buffer Names::), or at least call `buffer-disable-undo' on them.
3846 Otherwise they may stay referenced by internal undo variable(s)
3847 after getting killed. If this happens before dumping (*note
3848 Building XEmacs::), this may cause a fatal error when portable
3851 * Indent each function with `C-M-q' (`indent-sexp') using the
3852 default indentation parameters.
3854 * Don't make a habit of putting close-parentheses on lines by
3855 themselves; Lisp programmers find this disconcerting. Once in a
3856 while, when there is a sequence of many consecutive
3857 close-parentheses, it may make sense to split them in one or two
3860 * Please put a copyright notice on the file if you give copies to
3861 anyone. Use the same lines that appear at the top of the Lisp
3862 files in XEmacs itself. If you have not signed papers to assign
3863 the copyright to the Foundation, then place your name in the
3864 copyright notice in place of the Foundation's name.
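
Several of the tips above (a common name prefix, the `p' and `-flag'
suffixes, reporting failures with `error', and calling `provide') can
be seen together in the following sketch. The library name `mylib' and
every name beginning with `mylib-' are hypothetical.

```elisp
;; Every global name carries the hypothetical prefix `mylib-'.
(defvar mylib-verbose-flag nil
  "Non-nil means `mylib-check-count' reports each value it accepts.")

(defun mylib-cadr (list)
  "Return the second element of LIST."
  (car (cdr list)))

(defun mylib-positive-p (n)
  "Return t if N is a number greater than zero."
  (and (numberp n) (> n 0)))

(defun mylib-check-count (n)
  "Return N if it is positive; otherwise signal an error."
  (if (not (mylib-positive-p n))
      ;; Report failures with `error', not `message' or `beep'.
      (error "Count must be positive: %s" n))
  (if mylib-verbose-flag
      (message "Accepted count %s" n))
  n)

;; Let other libraries load this one with (require 'mylib).
(provide 'mylib)
```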
3867 File: lispref.info, Node: Compilation Tips, Next: Documentation Tips, Prev: Style Tips, Up: Tips
3869 Tips for Making Compiled Code Fast
3870 ==================================
3872 Here are ways of improving the execution speed of byte-compiled Lisp
3875 * Use the `profile' library to profile your program. See the file
3876 `profile.el' for instructions.
3878 * Use iteration rather than recursion whenever possible. Function
3879 calls are slow in XEmacs Lisp even when a compiled function is
3880 calling another compiled function.
3882 * Using the primitive list-searching functions `memq', `member',
3883 `assq', or `assoc' is even faster than explicit iteration. It may
3884 be worth rearranging a data structure so that one of these
3885 primitive search functions can be used.
3887 * Certain built-in functions are handled specially in byte-compiled
3888 code, avoiding the need for an ordinary function call. It is a
3889 good idea to use these functions rather than alternatives. To see
3890 whether a function is handled specially by the compiler, examine
3891 its `byte-compile' property. If the property is non-`nil', then
3892 the function is handled specially.
3894 For example, the following input will show you that `aref' is
3895 compiled specially (*note Array Functions::) while `elt' is not
3896 (*note Sequence Functions::):
3898 (get 'aref 'byte-compile)
3899 => byte-compile-two-args
3901 (get 'elt 'byte-compile)
3904 * If calling a small function accounts for a substantial part of
3905 your program's running time, make the function inline. This
3906 eliminates the function call overhead. Since making a function
3907 inline reduces the flexibility of changing the program, don't do
3908 it unless it gives a noticeable speedup in something slow enough
3909 that users care about the speed. *Note Inline Functions::.
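
As a sketch of two of these tips, the first pair of functions below
performs the same search with an explicit loop and with the primitive
`memq', and the last definition uses `defsubst' to make a small helper
inline. All function names are hypothetical, and no timing claims are
made for this particular example.

```elisp
(defun my-find-symbol-by-loop (symbol list)
  "Return the tail of LIST whose car is SYMBOL, using an explicit loop."
  (while (and list (not (eq (car list) symbol)))
    (setq list (cdr list)))
  list)

(defun my-find-symbol-by-memq (symbol list)
  "Return the tail of LIST whose car is SYMBOL, using the primitive."
  (memq symbol list))

;; A small, frequently called helper made inline with `defsubst'.
(defsubst my-square (x)
  "Return X multiplied by itself."
  (* x x))
```

Both search functions return the same value for the same arguments, so
the `memq' version can replace the loop without changing callers.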
3912 File: lispref.info, Node: Documentation Tips, Next: Comment Tips, Prev: Compilation Tips, Up: Tips
3914 Tips for Documentation Strings
3915 ==============================
3917 Here are some tips for the writing of documentation strings.
3919 * Every command, function, or variable intended for users to know
3920 about should have a documentation string.
3922 * An internal variable or subroutine of a Lisp program might as well
3923 have a documentation string. In earlier Emacs versions, you could
3924 save space by using a comment instead of a documentation string,
3925 but that is no longer the case.
3927 * The first line of the documentation string should consist of one
3928 or two complete sentences that stand on their own as a summary.
3929 `M-x apropos' displays just the first line, and if it doesn't
3930 stand on its own, the result looks bad. In particular, start the
3931 first line with a capital letter and end with a period.
3933 The documentation string can have additional lines that expand on
3934 the details of how to use the function or variable. The
3935 additional lines should be made up of complete sentences also, but
3936 they may be filled if that looks good.
3938 * For consistency, phrase the verb in the first sentence of a
3939 documentation string as an infinitive with "to" omitted. For
3940 instance, use "Return the cons of A and B." in preference to
3941 "Returns the cons of A and B." Usually it looks good to do
3942 likewise for the rest of the first paragraph. Subsequent
3943 paragraphs usually look better if they have proper subjects.
3945 * Write documentation strings in the active voice, not the passive,
3946 and in the present tense, not the future. For instance, use
3947 "Return a list containing A and B." instead of "A list containing
3948 A and B will be returned."
3950 * Avoid using the word "cause" (or its equivalents) unnecessarily.
3951 Instead of, "Cause Emacs to display text in boldface," write just
3952 "Display text in boldface."
3954 * Do not start or end a documentation string with whitespace.
3956 * Format the documentation string so that it fits in an Emacs window
3957 on an 80-column screen. It is a good idea for most lines to be no
3958 wider than 60 characters. The first line can be wider if
3959 necessary to fit the information that ought to be there.
3961 However, rather than simply filling the entire documentation
3962 string, you can make it much more readable by choosing line breaks
3963 with care. Use blank lines between topics if the documentation
3966 * *Do not* indent subsequent lines of a documentation string so that
3967 the text is lined up in the source code with the text of the first
3968 line. This looks nice in the source code, but looks bizarre when
3969 users view the documentation. Remember that the indentation
3970 before the starting double-quote is not part of the string!
3972 * A variable's documentation string should start with `*' if the
3973 variable is one that users would often want to set interactively.
3974 If the value is a long list, or a function, or if the variable
3975 would be set only in init files, then don't start the
3976 documentation string with `*'. *Note Defining Variables::.
3978 * The documentation string for a variable that is a yes-or-no flag
3979 should start with words such as "Non-nil means...", to make it
3980 clear that all non-`nil' values are equivalent and indicate
3981 explicitly what `nil' and non-`nil' mean.
3983 * When a function's documentation string mentions the value of an
3984 argument of the function, use the argument name in capital letters
3985 as if it were a name for that value. Thus, the documentation
3986 string of the function `/' refers to its second argument as
3987 `DIVISOR', because the actual argument name is `divisor'.
3989 Also use all caps for meta-syntactic variables, such as when you
3990 show the decomposition of a list or vector into subunits, some of
3993 * When a documentation string refers to a Lisp symbol, write it as it
3994 would be printed (which usually means in lower case), with
3995 single-quotes around it. For example: `lambda'. There are two
3996 exceptions: write t and nil without single-quotes. (In this
3997 manual, we normally do use single-quotes for those symbols.)
3999 * Don't write key sequences directly in documentation strings.
4000 Instead, use the `\\[...]' construct to stand for them. For
4001 example, instead of writing `C-f', write `\\[forward-char]'. When
4002 Emacs displays the documentation string, it substitutes whatever
4003 key is currently bound to `forward-char'. (This is normally `C-f',
4004 but it may be some other character if the user has moved key
4005 bindings.) *Note Keys in Documentation::.
4007 * In documentation strings for a major mode, you will want to refer
4008 to the key bindings of that mode's local map, rather than global
4009 ones. Therefore, use the construct `\\<...>' once in the
4010 documentation string to specify which key map to use. Do this
4011 before the first use of `\\[...]'. The text inside the `\\<...>'
4012 should be the name of the variable containing the local keymap for
4015 It is not practical to use `\\[...]' very many times, because
4016 display of the documentation string will become slow. So use this
4017 to describe the most important commands in your major mode, and
4018 then use `\\{...}' to display the rest of the mode's keymap.
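
The following sketch shows documentation strings written to these
conventions: a summary line that stands on its own, the argument name
in capitals, a quoted symbol, a leading `*' on a user option, and
`\\<...>' placed before the first `\\[...]'. The keymap, variable, and
command names are all hypothetical.

```elisp
(defvar my-demo-mode-map (make-sparse-keymap)
  "Keymap for the hypothetical my-demo mode.")

(defvar my-demo-shout-flag nil
  "*Non-nil means `my-demo-insert-greeting' upcases NAME.")

(defun my-demo-insert-greeting (name)
  "Insert a greeting for NAME at point.
NAME must be a string.

\\<my-demo-mode-map>In buffers that use `my-demo-mode-map', type
\\[my-demo-insert-greeting] to run this command."
  (interactive "sName: ")
  (insert "Hello, " (if my-demo-shout-flag (upcase name) name)))
```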
4021 File: lispref.info, Node: Comment Tips, Next: Library Headers, Prev: Documentation Tips, Up: Tips
4023 Tips on Writing Comments
4024 ========================
4026 We recommend these conventions for where to put comments and how to
4030 Comments that start with a single semicolon, `;', should all be
4031 aligned to the same column on the right of the source code. Such
4032 comments usually explain how the code on the same line does its
4033 job. In Lisp mode and related modes, the `M-;'
4034 (`indent-for-comment') command automatically inserts such a `;' in
4035 the right place, or aligns such a comment if it is already present.
4037 This and following examples are taken from the Emacs sources.
4039 (setq base-version-list ; there was a base
4040 (assoc (substring fn 0 start-vn) ; version to which
4041 file-version-assoc-list)) ; this looks like
4045 Comments that start with two semicolons, `;;', should be aligned to
4046 the same level of indentation as the code. Such comments usually
4047 describe the purpose of the following lines or the state of the
4048 program at that point. For example:
4050 (prog1 (setq auto-fill-function
4056 Every function that has no documentation string (because it is
4057 used only internally within the package it belongs to), should
4058 have instead a two-semicolon comment right before the function,
4059 explaining what the function does and how to call it properly.
4060 Explain precisely what each argument means and how the function
4061 interprets its possible values.
4064 Comments that start with three semicolons, `;;;', should start at
4065 the left margin. Such comments are used outside function
4066 definitions to make general statements explaining the design
4067 principles of the program. For example:
4069 ;;; This Lisp code is run in XEmacs
4070 ;;; when it is to operate as a server
4071 ;;; for other processes.
4073 Another use for triple-semicolon comments is for commenting out
4074 lines within a function. We use triple-semicolons for this
4075 precisely so that they remain at the left margin.
4078 ;;; This is no longer necessary.
4079 ;;; (force-mode-line-update)
4080 (message "Finished with %s" a))
4083 Comments that start with four semicolons, `;;;;', should be aligned
4084 to the left margin and are used for headings of major sections of a
4085 program. For example:
4089 The indentation commands of the Lisp modes in XEmacs, such as `M-;'
4090 (`indent-for-comment') and <TAB> (`lisp-indent-line') automatically
4091 indent comments according to these conventions, depending on the number
4092 of semicolons. *Note Manipulating Comments: (xemacs)Comments.
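The conventions above can be sketched on a small internal function.
This is only an illustration: the package name `my-pkg' and the
function `my-pkg-position' are hypothetical and do not come from the
Emacs sources.

```elisp
;;;; Searching helpers

;;; The functions in this section are internal to a hypothetical
;;; package named `my-pkg' and have no documentation strings.

;; Return the 1-based position of ITEM in LIST, or nil if ITEM is absent.
;; ITEM is compared with `equal', so strings and lists match by content.
;; LIST must be a proper list; it is not modified.
(defun my-pkg-position (item list)
  (let ((index 1)                       ; position of the current element
        (found nil))                    ; set once ITEM is seen
    (while (and list (not found))
      (if (equal item (car list))
          (setq found index)
        (setq index (1+ index)
              list (cdr list))))
    found))
```

The four-semicolon and three-semicolon comments sit at the left
margin, the two-semicolon comment documents the function in place of a
documentation string, and the single-semicolon comments explain the
code on their own lines.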
4095 File: lispref.info, Node: Library Headers, Prev: Comment Tips, Up: Tips
4097 Conventional Headers for XEmacs Libraries
4098 =========================================
4100 XEmacs has conventions for using special comments in Lisp libraries to
4101 divide them into sections and give information such as who wrote them.
4102 This section explains these conventions. First, an example:
4104 ;;; lisp-mnt.el --- minor mode for Emacs Lisp maintainers
4106 ;; Copyright (C) 1992 Free Software Foundation, Inc.
4108 ;; Author: Eric S. Raymond <esr@snark.thyrsus.com>
4109 ;; Maintainer: Eric S. Raymond <esr@snark.thyrsus.com>
4110 ;; Created: 14 Jul 1992
4114 ;; This file is part of XEmacs.
4115 COPYING PERMISSIONS...
4117 The very first line should have this format:
4119 ;;; FILENAME --- DESCRIPTION
4121 The description should be complete in one line.
4123 After the copyright notice come several "header comment" lines, each
4124 beginning with `;; HEADER-NAME:'. Here is a table of the conventional
4125 possibilities for HEADER-NAME:
4127 `Author'
4128 This line states the name and net address of at least the principal
4129 author of the library.
4131 If there are multiple authors, you can list them on continuation
4132 lines led by `;;' and a tab character, like this:
4134 ;; Author: Ashwin Ram <Ram-Ashwin@cs.yale.edu>
4135 ;; Dave Sill <de5@ornl.gov>
4136 ;; Dave Brennan <brennan@hal.com>
4137 ;; Eric Raymond <esr@snark.thyrsus.com>
4139 `Maintainer'
4140 This line should contain a single name/address as in the Author
4141 line, or an address only, or the string `FSF'. If there is no
4142 maintainer line, the person(s) in the Author field are presumed to
4143 be the maintainers. The example above is mildly bogus because the
4144 maintainer line is redundant.
4146 The idea behind the `Author' and `Maintainer' lines is to make
4147 possible a Lisp function to "send mail to the maintainer" without
4148 having to mine the name out by hand.
4150 Be sure to surround the network address with `<...>' if you
4151 include the person's full name as well as the network address.
4154 This optional line gives the original creation date of the file.
4155 For historical interest only.
4158 If you wish to record version numbers for the individual Lisp
4159 program, put them in this line.
4162 In this header line, place the name of the person who adapted the
4163 library for installation (to make it fit the style conventions, for
4167 This line lists keywords for the `finder-by-keyword' help command.
4168 This field is important; it's how people will find your package
4169 when they're looking for things by topic area. To separate the
4170 keywords, you can use spaces, commas, or both.
4172 Just about every Lisp library ought to have the `Author' and
4173 `Keywords' header comment lines. Use the others if they are
4174 appropriate. You can also put in header lines with other header
4175 names--they have no standard meanings, so they can't do any harm.
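Because header comment lines follow a fixed `;; HEADER-NAME:' shape, a
program can read them without any guessing. The following is a minimal
sketch of that idea, not the code any XEmacs library actually uses; the
function name `my-header-value' and the sample file contents are
assumptions made for illustration.

```elisp
(defun my-header-value (header)
  "Return the text after `;; HEADER:' in the current buffer, or nil."
  (save-excursion
    (goto-char (point-min))
    (if (re-search-forward
         (concat "^;;+[ \t]*" (regexp-quote header) ":[ \t]*\\(.*\\)$")
         nil t)
        (match-string 1))))

;; Usage: look up two headers in a scratch copy of a library header.
(with-temp-buffer
  (insert ";;; foo.el --- demonstration library\n\n"
          ";; Author: Jane Doe <jane@example.org>\n"
          ";; Keywords: lisp, tools\n")
  (list (my-header-value "Author")
        (my-header-value "Keywords")))
```

The sketch reads only the first line of a header, so continuation
lines such as those in a multi-author `Author' field would need extra
handling.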
4177 We use additional stylized comments to subdivide the contents of the
4178 library file. Here is a table of them:
4181 This begins introductory comments that explain how the library
4182 works. It should come right after the copying permissions.
4185 This begins change log information stored in the library file (if
4186 you store the change history there). For most of the Lisp files
4187 distributed with XEmacs, the change history is kept in the file
4188 `ChangeLog' and not in the source file at all; these files do not
4189 have a `;;; Change log:' line.
4192 This begins the actual code of the program.
4194 `;;; FILENAME ends here'
4195 This is the "footer line"; it appears at the very end of the file.
4196 Its purpose is to enable people to detect truncated versions of
4197 the file from the lack of a footer line.
4200 File: lispref.info, Node: Building XEmacs and Object Allocation, Next: Standard Errors, Prev: Tips, Up: Top
4202 Building XEmacs; Allocation of Objects
4203 **************************************
4205 This chapter describes how the runnable XEmacs executable is dumped
4206 with the preloaded Lisp libraries in it and how storage is allocated.
4208 There is an entire separate document, the `XEmacs Internals Manual',
4209 devoted to the internals of XEmacs from the perspective of the C
4210 programmer. It contains much more detailed information about the build
4211 process, the allocation and garbage-collection process, and other
4212 aspects related to the internals of XEmacs.
4216 * Building XEmacs:: How to preload Lisp libraries into XEmacs.
4217 * Pure Storage:: A kludge to make preloaded Lisp functions sharable.
4218 * Garbage Collection:: Reclaiming space for Lisp objects no longer used.
4221 File: lispref.info, Node: Building XEmacs, Next: Pure Storage, Up: Building XEmacs and Object Allocation
4226 This section explains the steps involved in building the XEmacs
4227 executable. You don't have to know this material to build and install
4228 XEmacs, since the makefiles do all these things automatically. This
4229 information is pertinent to XEmacs maintenance.
4231 The `XEmacs Internals Manual' contains more information about this.
4233 Compilation of the C source files in the `src' directory produces an
4234 executable file called `temacs', also called a "bare impure XEmacs".
4235 It contains the XEmacs Lisp interpreter and I/O routines, but not the
4238 Before XEmacs is actually usable, a number of Lisp files need to be
4239 loaded. These define all the editing commands, plus most of the startup
4240 code and many very basic Lisp primitives. This is accomplished by
4241 loading the file `loadup.el', which in turn loads all of the other
4242 standardly-loaded Lisp files.
4244 It takes a substantial time to load the standard Lisp files.
4245 Luckily, you don't have to do this each time you run XEmacs; `temacs'
4246 can dump out an executable program called `xemacs' that has these files
4247 preloaded. `xemacs' starts more quickly because it does not need to
4248 load the files. This is the XEmacs executable that is normally
4251 To create `xemacs', use the command `temacs -batch -l loadup dump'.
4252 The purpose of `-batch' here is to tell `temacs' to run in
4253 non-interactive, command-line mode. (`temacs' can _only_ run in this
4254 fashion. Part of the code required to initialize frames and faces is
4255 in Lisp, and must be loaded before XEmacs is able to create any frames.)
4256 The argument `dump' tells `loadup.el' to dump a new executable named
4259 The dumping process is highly system-specific, and some operating
4260 systems don't support dumping. On those systems, you must start XEmacs
4261 with the `temacs -batch -l loadup run-temacs' command each time you use
4262 it. This takes a substantial time, but since you need to start Emacs
4263 once a day at most--or once a week if you never log out--the extra time
4264 is not too severe a problem. (In older versions of Emacs, you started
4265 Emacs from `temacs' using `temacs -l loadup'.)
4267 You are free to start XEmacs directly from `temacs' if you want,
4268 even if there is already a dumped `xemacs'. Normally you wouldn't want
4269 to do that; but the Makefiles do this when you rebuild XEmacs using
4270 `make all-elc', which builds XEmacs and simultaneously compiles any
4271 out-of-date Lisp files. (You need `xemacs' in order to compile Lisp
4272 files. However, you also need the compiled Lisp files in order to dump
4273 out `xemacs'. If both of these are missing or corrupted, you are out
4274 of luck unless you're able to bootstrap `xemacs' from `temacs'. Note
4275 that `make all-elc' actually loads the alternative loadup file
4276 `loadup-el.el', which works like `loadup.el' but disables the
4277 pure-copying process and forces XEmacs to ignore any compiled Lisp
4278 files even if they exist.)
4280 You can specify additional files to preload by writing a library
4281 named `site-load.el' that loads them. You may need to increase the
4282 value of `PURESIZE', in `src/puresize.h', to make room for the
4283 additional files. You should _not_ modify this file directly, however;
4284 instead, use the `--puresize' configuration option. (If you run out of
4285 pure space while dumping `xemacs', you will be told how much pure space
4286 you actually will need.) However, the advantage of preloading
4287 additional files decreases as machines get faster. On modern machines,
4288 it is often not advisable, especially if the Lisp code is on a file
4289 system local to the machine running XEmacs.
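As a rough sketch, a `site-load.el' that preloads one extra library
could look like the following. The library name `my-site-lib' is a
placeholder, not a file distributed with XEmacs.

```elisp
;;; site-load.el --- hypothetical site file, loaded while dumping

;; Preload a site-wide library so that it becomes part of the dumped
;; `xemacs'.  "my-site-lib" is a placeholder name.
(load "my-site-lib")

;;; site-load.el ends here
```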
4291 You can specify other Lisp expressions to execute just before dumping
4292 by putting them in a library named `site-init.el'. However, if they
4293 might alter the behavior that users expect from an ordinary unmodified
4294 XEmacs, it is better to put them in `default.el', so that users can
4295 override them if they wish. *Note Start-up Summary::.
4297 Before `loadup.el' dumps the new executable, it finds the
4298 documentation strings for primitive and preloaded functions (and
4299 variables) in the file where they are stored, by calling
4300 `Snarf-documentation' (*note Accessing Documentation::). These strings
4301 were moved out of the `xemacs' executable to make it smaller. *Note
4302 Documentation Basics::.
4304 - Function: dump-emacs to-file from-file
4305 This function dumps the current state of XEmacs into an executable
4306 file TO-FILE. It takes symbols from FROM-FILE (this is normally
4307 the executable file `temacs').
4309 If you use this function in an XEmacs that was already dumped, you
4310 must set `command-line-processed' to `nil' first for good results.
4311 *Note Command Line Arguments::.
4313 - Function: run-emacs-from-temacs &rest args
4314 This is the function that implements the `run-temacs' command-line
4315 argument. It is called from `loadup.el' as appropriate. You
4316 should most emphatically _not_ call this yourself; it will
4317 reinitialize your XEmacs process and you'll be sorry.
4319 - Command: emacs-version &optional arg
4320 This function returns a string describing the version of XEmacs
4321 that is running. It is useful to include this string in bug
4324 When called interactively with a prefix argument, it inserts the string at
4325 point. Don't use this function in programs to choose actions
4326 according to the system configuration; look at
4327 `system-configuration' instead.
4330 => "XEmacs 20.1 [Lucid] (i586-unknown-linux2.0.29)
4331 of Mon Apr 7 1997 on altair.xemacs.org"
4333 Called interactively, the function prints the same information in
4336 - Variable: emacs-build-time
4337 The value of this variable is the time at which XEmacs was built
4340 emacs-build-time => "Mon Apr 7 20:28:52 1997"
4343 - Variable: emacs-version
4344 The value of this variable is the version of Emacs being run. It
4345 is a string, e.g. `"20.1 XEmacs Lucid"'.
4347 The following two variables did not exist before FSF GNU Emacs
4348 version 19.23 and XEmacs version 19.10, which reduces their usefulness
4349 at present, but we hope they will be convenient in the future.
4351 - Variable: emacs-major-version
4352 The major version number of Emacs, as an integer. For XEmacs
4353 version 20.1, the value is 20.
4355 - Variable: emacs-minor-version
4356 The minor version number of Emacs, as an integer. For XEmacs
4357 version 20.1, the value is 1.
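Since these two variables are missing from older versions, code that
must run there can check that they are bound before comparing them.
Here is one possible sketch; the function name `my-emacs-at-least-p' is
hypothetical.

```elisp
(defun my-emacs-at-least-p (major minor)
  "Return non-nil if the running Emacs is version MAJOR.MINOR or later.
Return nil if the version variables are not bound at all."
  (and (boundp 'emacs-major-version)
       (boundp 'emacs-minor-version)
       (or (> emacs-major-version major)
           (and (= emacs-major-version major)
                (>= emacs-minor-version minor)))))

;; Usage: choose a code path for version 20.1 and later.
(if (my-emacs-at-least-p 20 1)
    (message "Version 20.1 or later")
  (message "Older than 20.1, or version variables unavailable"))
```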
4360 File: lispref.info, Node: Pure Storage, Next: Garbage Collection, Prev: Building XEmacs, Up: Building XEmacs and Object Allocation
4365 XEmacs Lisp uses two kinds of storage for user-created Lisp objects:
4366 "normal storage" and "pure storage". Normal storage is where all the
4367 new data created during an XEmacs session is kept; see the following
4368 section for information on normal storage. Pure storage is used for
4369 certain data in the preloaded standard Lisp files--data that should
4370 never change during actual use of XEmacs.
4372 Pure storage is allocated only while `temacs' is loading the
4373 standard preloaded Lisp libraries. In the file `xemacs', it is marked
4374 as read-only (on operating systems that permit this), so that the
4375 memory space can be shared by all the XEmacs jobs running on the machine
4376 at once. Pure storage is not expandable; a fixed amount is allocated
4377 when XEmacs is compiled, and if that is not sufficient for the preloaded
4378 libraries, `temacs' aborts with an error message. If that happens, you
4379 must increase the compilation parameter `PURESIZE' using the
4380 `--puresize' option to `configure'. This normally won't happen unless
4381 you try to preload additional libraries or add features to the standard
4384 - Function: purecopy object
4385 This function makes a copy of OBJECT in pure storage and returns
4386 it. It copies strings by simply making a new string with the same
4387 characters in pure storage. It recursively copies the contents of
4388 vectors and cons cells. It does not make copies of other objects
4389 such as symbols, but just returns them unchanged. It signals an
4390 error if asked to copy markers.
4392 This function is a no-op in XEmacs, and its use in new code is
4395 - Variable: pure-bytes-used
4396 The value of this variable is the number of bytes of pure storage
4397 allocated so far. Typically, in a dumped XEmacs, this number is
4398 very close to the total amount of pure storage available--if it
4399 were not, we would preallocate less.
4401 - Variable: purify-flag
4402 This variable determines whether `defun' should make a copy of the
4403 function definition in pure storage. If it is non-`nil', then the
4404 function definition is copied into pure storage.
4406 This flag is `t' while loading all of the basic functions for
4407 building XEmacs initially (allowing those functions to be sharable
4408 and non-collectible). Dumping XEmacs as an executable always
4409 writes `nil' in this variable, regardless of the value it actually
4410 has before and after dumping.
4412 You should not change this flag in a running XEmacs.
4415 File: lispref.info, Node: Garbage Collection, Prev: Pure Storage, Up: Building XEmacs and Object Allocation
4420 When a program creates a list or the user defines a new function (such
4421 as by loading a library), that data is placed in normal storage. If
4422 normal storage runs low, then XEmacs asks the operating system to
4423 allocate more memory in blocks of 2k bytes. Each block is used for one
4424 type of Lisp object, so symbols, cons cells, markers, etc., are
4425 segregated in distinct blocks in memory. (Vectors, long strings,
4426 buffers and certain other editing types, which are fairly large, are
4427 allocated in individual blocks, one per object, while small strings are
4428 packed into blocks of 8k bytes. [More correctly, a string is allocated
4429 in two sections: a fixed size chunk containing the length, list of
4430 extents, etc.; and a chunk containing the actual characters in the
4431 string. It is this latter chunk that is either allocated individually
4432 or packed into 8k blocks. The fixed size chunk is packed into 2k
4433 blocks, as for conses, markers, etc.])
4435 It is quite common to use some storage for a while, then release it
4436 by (for example) killing a buffer or deleting the last pointer to an
4437 object. XEmacs provides a "garbage collector" to reclaim this
4438 abandoned storage. (This name is traditional, but "garbage recycler"
4439 might be a more intuitive metaphor for this facility.)
4441 The garbage collector operates by finding and marking all Lisp
4442 objects that are still accessible to Lisp programs. To begin with, it
4443 assumes all the symbols, their values and associated function
4444 definitions, and any data presently on the stack, are accessible. Any
4445 objects that can be reached indirectly through other accessible objects
4446 are also accessible.
4448 When marking is finished, all objects still unmarked are garbage. No
4449 matter what the Lisp program or the user does, it is impossible to refer
4450 to them, since there is no longer a way to reach them. Their space
4451 might as well be reused, since no one will miss them. The second
4452 ("sweep") phase of the garbage collector arranges to reuse them.
4454 The sweep phase puts unused cons cells onto a "free list" for future
4455 allocation; likewise for symbols, markers, extents, events, floats,
4456 compiled-function objects, and the fixed-size portion of strings. It
4457 compacts the accessible small string-chars chunks so they occupy fewer
4458 8k blocks; then it frees the other 8k blocks. Vectors, buffers,
4459 windows, and other large objects are individually allocated and freed
4460 using `malloc' and `free'.
4462 Common Lisp note: unlike other Lisps, XEmacs Lisp does not call
4463 the garbage collector when the free list is empty. Instead, it
4464 simply requests the operating system to allocate more storage, and
4465 processing continues until `gc-cons-threshold' bytes have been
4468 This means that you can make sure that the garbage collector will
4469 not run during a certain portion of a Lisp program by calling the
4470 garbage collector explicitly just before it (provided that portion
4471 of the program does not use so much space as to force a second
4472 garbage collection).
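A minimal sketch of this technique follows. The function name
`my-call-after-gc' is hypothetical, and the approach only helps as long
as FUNCTION allocates less than `gc-cons-threshold' bytes.

```elisp
(defun my-call-after-gc (function)
  "Run a garbage collection, then call FUNCTION with no arguments.
Return whatever FUNCTION returns."
  (garbage-collect)
  (funcall function))

;; Usage: run a short piece of code right after a collection.
(my-call-after-gc (lambda () (+ 1 2)))
```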
4474 - Command: garbage-collect
4475 This command runs a garbage collection, and returns information on
4476 the amount of space in use. (Garbage collection can also occur
4477 spontaneously if you use more than `gc-cons-threshold' bytes of
4478 Lisp data since the previous garbage collection.)
4480 `garbage-collect' returns a list containing the following
4483 ((USED-CONSES . FREE-CONSES)
4484 (USED-SYMS . FREE-SYMS)
4485 (USED-MARKERS . FREE-MARKERS)
4490 => ((73362 . 8325) (13718 . 164)
4491 (5089 . 5098) 949121 118677
4492 (conses-used 73362 conses-free 8329 cons-storage 658168
4493 symbols-used 13718 symbols-free 164 symbol-storage 335216
4494 bit-vectors-used 0 bit-vectors-total-length 0
4495 bit-vector-storage 0 vectors-used 7882
4496 vectors-total-length 118677 vector-storage 537764
4497 compiled-functions-used 1336 compiled-functions-free 37
4498 compiled-function-storage 44440 short-strings-used 28829
4499 long-strings-used 2 strings-free 7722
4500 short-strings-total-length 916657 short-string-storage 1179648
4501 long-strings-total-length 32464 string-header-storage 441504
4502 floats-used 3 floats-free 43 float-storage 2044 markers-used 5089
4503 markers-free 5098 marker-storage 245280 events-used 103
4504 events-free 835 event-storage 110656 extents-used 10519
4505 extents-free 2718 extent-storage 372736
4506 extent-auxiliarys-used 111 extent-auxiliarys-freed 3
4507 extent-auxiliary-storage 4440 window-configurations-used 39
4508 window-configurations-on-free-list 5
4509 window-configurations-freed 10 window-configuration-storage 9492
4510 popup-datas-used 3 popup-data-storage 72 toolbar-buttons-used 62
4511 toolbar-button-storage 4960 toolbar-datas-used 12
4512 toolbar-data-storage 240 symbol-value-buffer-locals-used 182
4513 symbol-value-buffer-local-storage 5824
4514 symbol-value-lisp-magics-used 22
4515 symbol-value-lisp-magic-storage 1496
4516 symbol-value-varaliases-used 43
4517 symbol-value-varalias-storage 1032 opaque-lists-used 2
4518 opaque-list-storage 48 color-instances-used 12
4519 color-instance-storage 288 font-instances-used 5
4520 font-instance-storage 180 opaques-used 11 opaque-storage 312
4521 range-tables-used 1 range-table-storage 16 faces-used 34
4522 face-storage 2584 glyphs-used 124 glyph-storage 4464
4523 specifiers-used 775 specifier-storage 43869 weak-lists-used 786
4524 weak-list-storage 18864 char-tables-used 40
4525 char-table-storage 41920 buffers-used 25 buffer-storage 7000
4526 extent-infos-used 457 extent-infos-freed 73
4527 extent-info-storage 9140 keymaps-used 275 keymap-storage 12100
4528 consoles-used 4 console-storage 384 command-builders-used 2
4529 command-builder-storage 120 devices-used 2 device-storage 344
4530 frames-used 3 frame-storage 624 image-instances-used 47
4531 image-instance-storage 3008 windows-used 27 windows-freed 2
4532 window-storage 9180 lcrecord-lists-used 15
4533 lcrecord-list-storage 360 hash-tables-used 631
4534 hash-table-storage 25240 streams-used 1 streams-on-free-list 3
4535 streams-freed 12 stream-storage 91))
4537 Here is a table explaining each element:
4540 The number of cons cells in use.
4543 The number of cons cells for which space has been obtained
4544 from the operating system, but that are not currently being
4548 The number of symbols in use.
4551 The number of symbols for which space has been obtained from
4552 the operating system, but that are not currently being used.
4555 The number of markers in use.
4558 The number of markers for which space has been obtained from
4559 the operating system, but that are not currently being used.
4562 The total size of all strings, in characters.
4565 The total number of elements of existing vectors.
4568 A list of alternating keyword/value pairs providing more
4569 detailed information. (As you can see above, quite a lot of
4570 information is provided.)
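To show how the pieces of this list fit together, here is a small
sketch that picks a few values out of a result in the format described
above. The function name `my-gc-summary' is hypothetical, and the
sample data is a shortened copy of the example output, not a real
measurement.

```elisp
(defun my-gc-summary (info)
  "Return (USED-CONSES FREE-CONSES VECTORS-USED) taken from INFO.
INFO is a list in the format returned by `garbage-collect' as
described above."
  (let ((conses (nth 0 info))           ; (USED-CONSES . FREE-CONSES)
        (details (nth 5 info)))         ; keyword/value list
    (list (car conses)
          (cdr conses)
          (plist-get details 'vectors-used))))

;; Usage with a shortened sample in the documented format.
(my-gc-summary
 '((73362 . 8325) (13718 . 164) (5089 . 5098) 949121 118677
   (conses-used 73362 conses-free 8329 vectors-used 7882)))
```

In a running XEmacs, the argument would normally be the value of
`(garbage-collect)' itself.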
4572 - User Option: gc-cons-threshold
4573 The value of this variable is the number of bytes of storage that
4574 must be allocated for Lisp objects after one garbage collection in
4575 order to trigger another garbage collection. A cons cell counts
4576 as eight bytes, a string as one byte per character plus a few
4577 bytes of overhead, and so on; space allocated to the contents of
4578 buffers does not count. Note that the subsequent garbage
4579 collection does not happen immediately when the threshold is
4580 exhausted, but only the next time the Lisp evaluator is called.
4582 The initial threshold value is 500,000. If you specify a larger
4583 value, garbage collection will happen less often. This reduces the
4584 amount of time spent garbage collecting, but increases total
4585 memory use. You may want to do this when running a program that
4586 creates lots of Lisp data.
4588 You can make collections more frequent by specifying a smaller
4589 value, down to 10,000. A value less than 10,000 will remain in
4590 effect only until the subsequent garbage collection, at which time
4591 `garbage-collect' will set the threshold back to 10,000. (This does
4592 not apply if XEmacs was configured with `--debug'. Therefore, be
4593 careful when setting `gc-cons-threshold' in that case!)
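One common way to apply a larger threshold only where it is needed is
to bind the variable with `let' around the code that creates a lot of
Lisp data. The sketch below assumes a hypothetical function
`my-build-big-list'; the figure 4000000 is an arbitrary example
value, not a recommendation.

```elisp
(defun my-build-big-list (n)
  "Return the list (0 1 ... N-1), built with a raised GC threshold."
  (let ((gc-cons-threshold (max gc-cons-threshold 4000000))
        (result nil)
        (i 0))
    (while (< i n)
      (setq result (cons i result)
            i (1+ i)))
    (nreverse result)))

;; Usage: the previous threshold is restored when the function returns.
(length (my-build-big-list 1000))
```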
4595 - Variable: pre-gc-hook
4596 This is a normal hook to be run just before each garbage
4597 collection. Interrupts, garbage collection, and errors are
4598 inhibited while this hook runs, so be extremely careful in what
4599 you add here. In particular, avoid consing, and do not interact
4602 - Variable: post-gc-hook
4603 This is a normal hook to be run just after each garbage collection.
4604 Interrupts, garbage collection, and errors are inhibited while
4605 this hook runs, so be extremely careful in what you add here. In
4606 particular, avoid consing, and do not interact with the user.
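As a cautious illustration, a hook function that only increments an
integer counter should stay within these restrictions, since it
neither creates new Lisp objects nor interacts with the user. The
names `my-gc-count' and `my-count-gc' are hypothetical.

```elisp
(defvar my-gc-count 0
  "Number of garbage collections counted by `my-count-gc'.")

;; Called from `post-gc-hook'; it only increments a small integer.
(defun my-count-gc ()
  (setq my-gc-count (1+ my-gc-count)))

(add-hook 'post-gc-hook 'my-count-gc)
```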
4608 - Variable: gc-message
4609 This is a string to print to indicate that a garbage collection is
4610 in progress. This is printed in the echo area. If the selected
4611 frame is on a window system and `gc-pointer-glyph' specifies a
4612 value (i.e. a pointer image instance) in the domain of the
4613 selected frame, the mouse cursor will change instead of this
4614 message being printed.
4616 - Glyph: gc-pointer-glyph
4617 This holds the pointer glyph used to indicate that a garbage
4618 collection is in progress. If the selected window is on a window
4619 system and this glyph specifies a value (i.e. a pointer image
4620 instance) in the domain of the selected window, the cursor will be
4621 changed as specified during garbage collection. Otherwise, a
4622 message will be printed in the echo area, as controlled by
4623 `gc-message'. *Note Glyphs::.
4625 If XEmacs was configured with `--debug', you can set the following
4626 two variables to get direct information about all the allocation that
4627 is happening in a segment of Lisp code.
4629 - Variable: debug-allocation
4630 If non-zero, print out information to stderr about all objects
4633 - Variable: debug-allocation-backtrace
4634 Length (in stack frames) of short backtrace printed out by
4638 File: lispref.info, Node: Standard Errors, Next: Standard Buffer-Local Variables, Prev: Building XEmacs and Object Allocation, Up: Top
4643 Here is the complete list of the error symbols in standard Emacs,
4644 grouped by concept. The list includes each symbol's message (on the
4645 `error-message' property of the symbol) and a cross reference to a
4646 description of how the error can occur.
4648 Each error symbol has an `error-conditions' property that is a list
4649 of symbols. Normally this list includes the error symbol itself and
4650 the symbol `error'. Occasionally it includes additional symbols, which
4651 are intermediate classifications, narrower than `error' but broader
4652 than a single error symbol. For example, all the errors in accessing
4653 files have the condition `file-error'.
4655 As a special exception, the error symbol `quit' does not have the
4656 condition `error', because quitting is not considered an error.
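The `error-conditions' property can be inspected directly, and it is
what lets a handler for a broad condition such as `file-error' catch
the narrower errors classified under it. The following sketch uses two
hypothetical helper names, `my-error-is-a-p' and `my-try-insert-file'.

```elisp
(defun my-error-is-a-p (error-symbol condition)
  "Return non-nil if ERROR-SYMBOL has CONDITION among its conditions."
  (memq condition (get error-symbol 'error-conditions)))

(defun my-try-insert-file (file)
  "Insert FILE into the current buffer.
Return the symbol `failed' if any error with the condition
`file-error' is signaled."
  (condition-case nil
      (insert-file-contents file)
    (file-error 'failed)))

;; Usage: `file-error' is an error, but `quit' is not.
(list (and (my-error-is-a-p 'file-error 'error) t)
      (and (my-error-is-a-p 'quit 'error) t))
```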
4658 *Note Errors::, for an explanation of how errors are generated and
4673 `"Args out of range"'
4674 *Note Sequences Arrays Vectors::.
4677 `"Arithmetic error"'
4678 See `/' and `%' in *Note Numbers::.
4680 `beginning-of-buffer'
4681 `"Beginning of buffer"'
4685 `"Buffer is read-only"'
4686 *Note Read Only Buffers::.
4688 `cyclic-function-indirection'
4689 `"Symbol's chain of function indirections contains a loop"'
4690 *Note Function Indirection::.
4693 `"Arithmetic domain error"'
4699 `"End of file during parsing"'
4700 This is not a `file-error'.
4701 *Note Input Functions::.
4704 This error and its subcategories do not have error-strings,
4705 because the error message is constructed from the data items alone
4706 when the error condition `file-error' is present.
4710 This is a `file-error'.
4713 `file-already-exists'
4714 This is a `file-error'.
4715 *Note Writing to Files::.
4718 This is a `file-error'.
4719 *Note Modification Time::.
4722 `"Invalid byte code"'
4723 *Note Byte Compilation::.
4726 `"Invalid function"'
4727 *Note Classifying Lists::.
4729 `invalid-read-syntax'
4730 `"Invalid read syntax"'
4731 *Note Input Functions::.
4735 *Note Regular Expressions::.
4738 `"The mark is not active now"'
4740 `"No catch for tag"'
4741 *Note Catch and Throw::.
4744 `"Arithmetic overflow error"'
4746 `"Attempt to modify a protected field"'
4748 `"Arithmetic range error"'
4751 *Note Searching and Matching::.
4754 `"Attempt to set a constant symbol"'
4755 *Note Variables that Never Change: Constant Variables.
4758 `"Arithmetic singularity error"'
4761 *Note ToolTalk Support::.
4763 `undefined-keystroke-sequence'
4764 `"Undefined keystroke sequence"'
4766 `"Symbol's function definition is void"'
4767 *Note Function Cells::.
4770 `"Symbol's value as variable is void"'
4771 *Note Accessing Variables::.
4773 `wrong-number-of-arguments'
4774 `"Wrong number of arguments"'
4775 *Note Classifying Lists::.
4777 `wrong-type-argument'
4778 `"Wrong type argument"'
4779 *Note Type Predicates::.
4781 These error types, which are all classified as special cases of
4782 `arith-error', can occur on certain systems for invalid use of
4783 mathematical functions.
4786 `"Arithmetic domain error"'
4787 *Note Math Functions::.
4790 `"Arithmetic overflow error"'
4791 *Note Math Functions::.
4794 `"Arithmetic range error"'
4795 *Note Math Functions::.
4798 `"Arithmetic singularity error"'
4799 *Note Math Functions::.
4802 `"Arithmetic underflow error"'
4803 *Note Math Functions::.
4806 File: lispref.info, Node: Standard Buffer-Local Variables, Next: Standard Keymaps, Prev: Standard Errors, Up: Top
4808 Buffer-Local Variables
4809 **********************
4811 The table below lists the general-purpose Emacs variables that are
4812 automatically local (when set) in each buffer. Many Lisp packages
4813 define such variables for their internal use; we don't list them here.
4818 `auto-fill-function'
4819 *note Auto Filling::
4821 `buffer-auto-save-file-name'
4825 *note Backup Files::
4827 `buffer-display-table'
4828 *note Display Tables::
4830 `buffer-file-format'
4831 *note Format Conversion::
4834 *note Buffer File Name::
4836 `buffer-file-number'
4837 *note Buffer File Name::
4839 `buffer-file-truename'
4840 *note Buffer File Name::
4843 *note Files and MS-DOS::
4845 `buffer-invisibility-spec'
4846 *note Invisible Text::
4849 *note Saving Buffers::
4852 *note Read Only Buffers::
4860 `cache-long-line-scans'
4864 *note Searching and Case::
4867 *note Usual Display::
4870 *note Comments: (xemacs)Comments.
4873 *note System Environment::
4875 `defun-prompt-regexp'
4879 *note Auto Filling::
4882 *note Moving Point: (xemacs)Moving Point.
4887 `local-abbrev-table'
4890 `local-write-file-hooks'
4891 *note Saving Buffers::
4906 *note Modeline Data::
4908 `modeline-buffer-identification'
4909 *note Modeline Variables::
4912 *note Modeline Data::
4915 *note Modeline Variables::
4918 *note Modeline Variables::
4921 *note Modeline Variables::
4926 `paragraph-separate'
4927 *note Standard Regexps::
4930 *note Standard Regexps::
4932 `point-before-scroll'
4933 Used for communication between mouse commands and scroll-bar
4936 `require-final-newline'
4940 *note Selective Display::
4942 `selective-display-ellipses'
4943 *note Selective Display::
4946 *note Usual Display::
4952 *note Modeline Variables::
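To illustrate what "automatically local (when set)" means, the sketch
below sets `case-fold-search' in one buffer and then reads it in two
buffers. The function name `my-local-set-demo' and the buffer names are
hypothetical.

```elisp
(defun my-local-set-demo ()
  "Set `case-fold-search' to nil in one buffer only.
Return a list of its value in that buffer and in a second buffer."
  (let ((a (generate-new-buffer " *demo-a*"))
        (b (generate-new-buffer " *demo-b*")))
    (unwind-protect
        (progn
          (with-current-buffer a
            ;; A plain `setq' affects only buffer A.
            (setq case-fold-search nil))
          (list (with-current-buffer a case-fold-search)
                (with-current-buffer b case-fold-search)))
      (kill-buffer a)
      (kill-buffer b))))
```

The first element of the result is `nil', while the second is still the
default value, because the assignment made the variable local to the
first buffer.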
4955 File: lispref.info, Node: Standard Keymaps, Next: Standard Hooks, Prev: Standard Buffer-Local Variables, Up: Top
4960 The following symbols are used as the names for various keymaps. Some
4961 of these exist when XEmacs is first started, others are loaded only
4962 when their respective mode is used. This is not an exhaustive list.
4964 Almost all of these maps are used as local maps. Indeed, of the
4965 modes that presently exist, only Vip mode and Terminal mode ever change
4969 A keymap containing bindings to bookmark functions.
4971 `Buffer-menu-mode-map'
4972 A keymap used by Buffer Menu mode.
4975 A keymap used by C++ mode.
4978 A sparse keymap used by C mode.
4980 `command-history-map'
4981 A keymap used by Command History mode.
4984 A keymap for subcommands of the prefix `C-x 4'.
4987 A keymap for subcommands of the prefix `C-x 5'.
4990 A keymap for `C-x' commands.
4993 A keymap used by Debugger mode.
4996 A keymap for `dired-mode' buffers.
4999 A keymap used in `edit-abbrevs'.
5001 `edit-tab-stops-map'
5002 A keymap used in `edit-tab-stops'.
5004 `electric-buffer-menu-mode-map'
5005 A keymap used by Electric Buffer Menu mode.
5007 `electric-history-map'
5008 A keymap used by Electric Command History mode.
5010 `emacs-lisp-mode-map'
5011 A keymap used by Emacs Lisp mode.
5014 A keymap for characters following the Help key.
5017 A keymap used by the help utility package.
5018 It has the same keymap in its value cell and in its function cell.
5021 A keymap used by the `e' command of Info.
5024 A keymap containing Info commands.
5027 A keymap that defines the characters you can type within
5031 A keymap used when in Itimer Edit mode.
5033 `lisp-interaction-mode-map'
5034 A keymap used by Lisp Interaction mode.
5037 A keymap used by Lisp mode.
5039 A keymap for minibuffer input with completion.
5041 `minibuffer-local-isearch-map'
5042 A keymap for editing isearch strings in the minibuffer.
5044 `minibuffer-local-map'
5045 Default keymap to use when reading from the minibuffer.
5047 `minibuffer-local-must-match-map'
5048 A keymap for minibuffer input with completion, for exact match.
5051 The keymap for characters following `C-c'. Note, this is in the
5052 global map. This map is not actually mode specific: its name was
5053 chosen to be informative for the user in `C-h b'
5054 (`describe-bindings'), where it describes the main use of the `C-c'
5058 The keymap consulted for mouse-clicks on the modeline of a window.
5061 A keymap used in Objective C mode as a local map.
5064 A local keymap used by Occur mode.
5066 `overriding-local-map'
5067 A keymap that overrides all other local keymaps.
5070 A local keymap used for responses in `query-replace' and related
5071 commands; also for `y-or-n-p' and `map-y-or-n-p'. The functions
5072 that use this map do not support prefix keys; they look up one
5075 `read-expression-map'
5076 The minibuffer keymap used for reading Lisp expressions.
5078 `read-shell-command-map'
5079 The minibuffer keymap used by `shell-command' and related commands.
5081 `shared-lisp-mode-map'
5082 A keymap for commands shared by all sorts of Lisp modes.
5085 A keymap used by Text mode.
5088 The keymap consulted for mouse-clicks over a toolbar.
5091 A keymap used by View mode.
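
As a rough illustration of how maps like those listed above are typically used, the following sketch builds a small keymap, binds a key in it, and installs it as the local map of a temporary buffer. The names `my-demo-command' and `my-demo-map' and the key sequence `C-c d' are hypothetical, chosen only for this example; they are not part of XEmacs.

```elisp
;; Hypothetical command, defined only for this illustration.
(defun my-demo-command ()
  "Do nothing; stands in for a real command."
  (interactive)
  nil)

;; Use a fresh sparse keymap so that no real bindings are disturbed.
(defvar my-demo-map (make-sparse-keymap)
  "Hypothetical keymap for this example.")

;; Bind `C-c d' in the new map.
(define-key my-demo-map "\C-cd" 'my-demo-command)

;; `lookup-key' returns the binding stored in the map.
(lookup-key my-demo-map "\C-cd")
     ;; => my-demo-command

;; Install the map as the local map of a scratch buffer.
(with-temp-buffer
  (use-local-map my-demo-map)
  (eq (current-local-map) my-demo-map))
     ;; => t
```

A major mode normally does the last step itself, calling `use-local-map' on its own mode map when the mode is turned on.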
5094 File: lispref.info, Node: Standard Hooks, Next: Index, Prev: Standard Keymaps, Up: Top
5099 The following is a list of hook variables that let you provide
5100 functions to be called from within Emacs on suitable occasions.
5102 Most of these variables have names ending with `-hook'. They are
5103 "normal hooks", run by means of `run-hooks'. The value of such a hook
5104 is a list of functions. The recommended way to put a new function on
5105 such a hook is to call `add-hook'. *Note Hooks::, for more information
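
A minimal sketch of a normal hook follows. The hook `my-demo-hook', the variable `my-demo-log', and the two functions are hypothetical names invented for this example; only `add-hook' and `run-hooks' come from the text above.

```elisp
(defvar my-demo-hook nil
  "Hypothetical normal hook, defined only for this example.")

(defvar my-demo-log nil
  "List recording which hook functions have run.")

;; Normal hook functions take no arguments.
(defun my-demo-note-a ()
  (setq my-demo-log (cons 'a my-demo-log)))

(defun my-demo-note-b ()
  (setq my-demo-log (cons 'b my-demo-log)))

;; `add-hook' adds the function unless it is already on the hook.
(add-hook 'my-demo-hook 'my-demo-note-a)
(add-hook 'my-demo-hook 'my-demo-note-b)
(add-hook 'my-demo-hook 'my-demo-note-b)   ; no effect the second time

;; Call every function on the hook, each with no arguments.
(run-hooks 'my-demo-hook)
```

After `run-hooks' returns, `my-demo-log' contains both `a' and `b', each exactly once.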
5108 The variables whose names end in `-function' have single functions
5109 as their values. Usually there is a specific reason why the variable is
5110 not a normal hook, such as the need to pass arguments to the function.
5111 (In older Emacs versions, some of these variables had names ending in
5112 `-hook' even though they were not normal hooks.)
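
The following sketch shows the general pattern for a `-function' variable. The variable `my-demo-format-function' and the function `my-demo-format' are hypothetical; they only illustrate that such a variable holds one function, which is called with arguments and replaced with `setq' rather than extended with `add-hook'.

```elisp
(defvar my-demo-format-function 'upcase
  "Hypothetical `-function' variable: a single function of one argument.")

(defun my-demo-format (string)
  "Format STRING by calling `my-demo-format-function' on it."
  (funcall my-demo-format-function string))

(my-demo-format "hook")
     ;; => "HOOK"

;; Replace the function outright; there is no list to add to.
(setq my-demo-format-function 'downcase)

(my-demo-format "Hook")
     ;; => "hook"
```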
5114 The variables whose names end in `-hooks' or `-functions' have lists
5115 of functions as their values, but these functions are called in a
5116 special way (they are passed arguments, or else their values are used).
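
As a sketch of the "special way" such a list can be run, the example below passes an argument to each function and uses the return values. The hook `my-demo-query-functions' and its two predicates are hypothetical names; the example also assumes that `run-hook-with-args-until-failure' is available in the XEmacs version in use.

```elisp
(defvar my-demo-query-functions nil
  "Hypothetical abnormal hook.
Each function receives one argument, a string, and returns non-nil
to approve it.")

(defun my-demo-nonempty-p (string)
  (> (length string) 0))

(defun my-demo-short-p (string)
  (< (length string) 10))

;; `add-hook' works for these lists just as for normal hooks.
(add-hook 'my-demo-query-functions 'my-demo-nonempty-p)
(add-hook 'my-demo-query-functions 'my-demo-short-p)

;; Every function is passed the argument; the run stops, returning
;; nil, as soon as one function returns nil.
(run-hook-with-args-until-failure 'my-demo-query-functions "abc")
     ;; => non-nil, since both functions approve
(run-hook-with-args-until-failure 'my-demo-query-functions "")
     ;; => nil, since `my-demo-nonempty-p' rejects the empty string
```

Variables such as `kill-buffer-query-functions' in the list below follow the same idea: the caller inspects what the functions return instead of simply running them.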
5118 `activate-menubar-hook'
5120 `activate-popup-menu-hook'
5122 `ad-definition-hooks'
5124 `adaptive-fill-function'
5126 `add-log-current-defun-function'
5128 `after-change-functions'
5130 `after-delete-annotation-hook'
5134 `after-insert-file-functions'
5140 `after-set-visited-file-name-hooks'
5142 `after-write-file-hooks'
5144 `auto-fill-function'
5148 `before-change-functions'
5150 `before-delete-annotation-hook'
5154 `before-revert-hook'
5156 `blink-paren-function'
5158 `buffers-menu-switch-to-buffer-function'
5164 `c-mode-common-hook'
5168 `c-special-indent-hook'
5170 `calendar-load-hook'
5172 `change-major-mode-hook'
5174 `command-history-hook'
5176 `comment-indent-function'
5178 `compilation-buffer-name-function'
5180 `compilation-exit-message-function'
5182 `compilation-finish-function'
5184 `compilation-parse-errors-function'
5186 `compilation-mode-hook'
5188 `create-console-hook'
5190 `create-device-hook'
5194 `dabbrev-friend-buffer-function'
5196 `dabbrev-select-buffers-function'
5198 `delete-console-hook'
5200 `delete-device-hook'
5204 `deselect-frame-hook'
5206 `diary-display-hook'
5210 `dired-after-readin-hook'
5212 `dired-before-readin-hook'
5218 `disabled-command-hook'
5220 `display-buffer-function'
5222 `ediff-after-setup-control-frame-hook'
5224 `ediff-after-setup-windows-hook'
5226 `ediff-before-setup-control-frame-hook'
5228 `ediff-before-setup-windows-hook'
5230 `ediff-brief-help-message-function'
5232 `ediff-cleanup-hook'
5234 `ediff-control-frame-position-function'
5236 `ediff-display-help-hook'
5238 `ediff-focus-on-regexp-matches-function'
5240 `ediff-forward-word-function'
5242 `ediff-hide-regexp-matches-function'
5244 `ediff-keymap-setup-hook'
5248 `ediff-long-help-message-function'
5250 `ediff-make-wide-display-function'
5252 `ediff-merge-split-window-function'
5254 `ediff-meta-action-function'
5256 `ediff-meta-redraw-function'
5260 `ediff-prepare-buffer-hook'
5264 `ediff-registry-setup-hook'
5268 `ediff-session-action-function'
5270 `ediff-session-group-setup-hook'
5272 `ediff-setup-diff-regions-function'
5274 `ediff-show-registry-hook'
5276 `ediff-show-session-group-hook'
5278 `ediff-skip-diff-region-function'
5280 `ediff-split-window-function'
5282 `ediff-startup-hook'
5284 `ediff-suspend-hook'
5286 `ediff-toggle-read-only-function'
5288 `ediff-unselect-hook'
5290 `ediff-window-setup-function'
5294 `electric-buffer-menu-mode-hook'
5296 `electric-command-history-hook'
5298 `electric-help-mode-hook'
5300 `emacs-lisp-mode-hook'
5302 `fill-paragraph-function'
5306 `find-file-not-found-hooks'
5310 `font-lock-after-fontify-buffer-hook'
5312 `font-lock-beginning-of-syntax-function'
5314 `font-lock-mode-hook'
5316 `fume-found-function-hook'
5318 `fume-list-mode-hook'
5320 `fume-rescan-buffer-hook'
5322 `fume-sort-function'
5326 `hack-local-variables-hook'
5328 `highlight-headers-follow-url-function'
5330 `hyper-apropos-mode-hook'
5332 `indent-line-function'
5336 `indent-region-function'
5338 `initial-calendar-window-hook'
5340 `isearch-mode-end-hook'
5348 `kill-buffer-query-functions'
5352 `kill-emacs-query-functions'
5362 `lisp-indent-function'
5364 `lisp-interaction-mode-hook'
5368 `list-diary-entries-hook'
5370 `load-read-function'
5372 `log-message-filter-function'
5376 `mail-citation-hook'
5382 `make-annotation-hook'
5384 `makefile-mode-hook'
5388 `mark-diary-entries-hook'
5392 `menu-no-selection-hook'
5394 `mh-compose-letter-hook'
5396 `mh-folder-mode-hook'
5398 `mh-letter-mode-hook'
5402 `minibuffer-exit-hook'
5404 `minibuffer-setup-hook'
5408 `mouse-enter-frame-hook'
5410 `mouse-leave-frame-hook'
5412 `mouse-track-cleanup-hook'
5414 `mouse-track-click-hook'
5416 `mouse-track-down-hook'
5418 `mouse-track-drag-hook'
5420 `mouse-track-drag-up-hook'
5422 `mouse-track-up-hook'
5424 `mouse-yank-function'
5428 `news-reply-mode-hook'
5432 `nongregorian-diary-listing-hook'
5434 `nongregorian-diary-marking-hook'
5444 `plain-TeX-mode-hook'
5450 `pre-abbrev-expand-hook'
5454 `pre-display-buffer-function'
5460 `print-diary-entries-hook'
5464 `protect-innocence-hook'
5466 `remove-message-hook'
5468 `revert-buffer-function'
5470 `revert-buffer-insert-contents-function'
5472 `rmail-edit-mode-hook'
5476 `rmail-retry-setup-hook'
5478 `rmail-summary-mode-hook'
5480 `scheme-indent-hook'
5488 `send-mail-function'
5492 `shell-set-directory-error-hook'
5494 `special-display-function'
5498 `suspend-resume-hook'
5500 `temp-buffer-show-function'
5504 `terminal-mode-hook'
5506 `terminal-mode-break-hook'
5514 `today-visible-calendar-hook'
5516 `today-invisible-calendar-hook'
5518 `tooltalk-message-handler-hook'
5520 `tooltalk-pattern-handler-hook'
5522 `tooltalk-unprocessed-message-hook'
5528 `vc-checkout-writable-buffer-hook'
5530 `vc-log-after-operation-hook'
5532 `vc-make-buffer-writable-hook'
5536 `vm-arrived-message-hook'
5538 `vm-arrived-messages-hook'
5540 `vm-chop-full-name-function'
5542 `vm-display-buffer-hook'
5544 `vm-edit-message-hook'
5546 `vm-forward-message-hook'
5548 `vm-iconify-frame-hook'
5550 `vm-inhibit-write-file-hook'
5558 `vm-menu-setup-hook'
5564 `vm-rename-current-buffer-function'
5568 `vm-resend-bounced-message-hook'
5570 `vm-resend-message-hook'
5572 `vm-retrieved-spooled-mail-hook'
5574 `vm-select-message-hook'
5576 `vm-select-new-message-hook'
5578 `vm-select-unread-message-hook'
5580 `vm-send-digest-hook'
5582 `vm-summary-mode-hook'
5584 `vm-summary-pointer-update-hook'
5586 `vm-summary-redo-hook'
5588 `vm-summary-update-hook'
5590 `vm-undisplay-buffer-hook'
5592 `vm-visit-folder-hook'
5596 `write-contents-hooks'
5598 `write-file-data-hooks'
5602 `write-region-annotate-functions'
5604 `x-lost-selection-hooks'
5606 `x-sent-selection-hooks'
5608 `zmacs-activate-region-hook'
5610 `zmacs-deactivate-region-hook'
5612 `zmacs-update-region-hook'