X-Git-Url: http://git.chise.org/gitweb/?a=blobdiff_plain;f=info%2Flispref.info-43;h=18337d74dedd8ccc6def722f538daa251a29e65a;hb=0b6cc849a8a353d01b8e5b001fcc27284d50ded8;hp=d39722dc9c93e481f7ea178abdf30787b075459d;hpb=f52a96980ed9280f8f906a20d4b899dc0b027644;p=chise%2Fxemacs-chise.git diff --git a/info/lispref.info-43 b/info/lispref.info-43 index d39722d..18337d7 100644 --- a/info/lispref.info-43 +++ b/info/lispref.info-43 @@ -50,985 +50,1162 @@ may be included in a translation approved by the Free Software Foundation instead of in the original English.  -File: lispref.info, Node: Internationalization Terminology, Next: Charsets, Up: MULE +File: lispref.info, Node: libpq Lisp Symbols and DataTypes, Next: Synchronous Interface Functions, Prev: libpq Lisp Variables, Up: XEmacs PostgreSQL libpq API -Internationalization Terminology -================================ +libpq Lisp Symbols and Datatypes +-------------------------------- - In internationalization terminology, a string of text is divided up -into "characters", which are the printable units that make up the text. -A single character is (for example) a capital `A', the number `2', a -Katakana character, a Hangul character, a Kanji ideograph (an -"ideograph" is a "picture" character, such as is used in Japanese -Kanji, Chinese Hanzi, and Korean Hanja; typically there are thousands -of such ideographs in each language), etc. The basic property of a -character is that it is the smallest unit of text with semantic -significance in text processing. - - Human beings normally process text visually, so to a first -approximation a character may be identified with its shape. Note that -the same character may be drawn by two different people (or in two -different fonts) in slightly different ways, although the "basic shape" -will be the same. But consider the works of Scott Kim; human beings -can recognize hugely variant shapes as the "same" character. 
-Sometimes, especially where characters are extremely complicated to -write, completely different shapes may be defined as the "same" -character in national standards. The Taiwanese variant of Hanzi is -generally the most complicated; over the centuries, the Japanese, -Koreans, and the People's Republic of China have adopted -simplifications of the shape, but the line of descent from the original -shape is recorded, and the meanings and pronunciation of different -forms of the same character are considered to be identical within each -language. (Of course, it may take a specialist to recognize the -related form; the point is that the relations are standardized, despite -the differing shapes.) - - In some cases, the differences will be significant enough that it is -actually possible to identify two or more distinct shapes that both -represent the same character. For example, the lowercase letters `a' -and `g' each have two distinct possible shapes--the `a' can optionally -have a curved tail projecting off the top, and the `g' can be formed -either of two loops, or of one loop and a tail hanging off the bottom. -Such distinct possible shapes of a character are called "glyphs". The -important characteristic of two glyphs making up the same character is -that the choice between one or the other is purely stylistic and has no -linguistic effect on a word (this is the reason why a capital `A' and -lowercase `a' are different characters rather than different -glyphs--e.g. `Aspen' is a city while `aspen' is a kind of tree). - - Note that "character" and "glyph" are used differently here than -elsewhere in XEmacs. - - A "character set" is essentially a set of related characters. ASCII, -for example, is a set of 94 characters (or 128, if you count -non-printing characters). 
Other character sets are ISO8859-1 (ASCII -plus various accented characters and other international symbols), JIS -X 0201 (ASCII, more or less, plus half-width Katakana), JIS X 0208 -(Japanese Kanji), JIS X 0212 (a second set of less-used Japanese Kanji), -GB2312 (Mainland Chinese Hanzi), etc. - - The definition of a character set will implicitly or explicitly give -it an "ordering", a way of assigning a number to each character in the -set. For many character sets, there is a natural ordering, for example -the "ABC" ordering of the Roman letters. But it is not clear whether -digits should come before or after the letters, and in fact different -European languages treat the ordering of accented characters -differently. It is useful to use the natural order where available, of -course. The number assigned to any particular character is called the -character's "code point". (Within a given character set, each -character has a unique code point. Thus the word "set" is ill-chosen; -different orderings of the same characters are different character sets. -Identifying characters is simple enough for alphabetic character sets, -but the difference in ordering can cause great headaches when the same -thousands of characters are used by different cultures as in the Hanzi.) - - A code point may be broken into a number of "position codes". The -number of position codes required to index a particular character in a -character set is called the "dimension" of the character set. For -practical purposes, a position code may be thought of as a byte-sized -index. The printing characters of ASCII, being a relatively small -character set, is of dimension one, and each character in the set is -indexed using a single position code, in the range 1 through 94. Use of -this unusual range, rather than the familiar 33 through 126, is an -intentional abstraction; to understand the programming issues you must -break the equation between character sets and encodings. - - JIS X 0208, i.e. 
Japanese Kanji, has thousands of characters, and is -of dimension two - every character is indexed by two position codes, -each in the range 1 through 94. (This number "94" is not a -coincidence; we shall see that the JIS position codes were chosen so -that JIS kanji could be encoded without using codes that in ASCII are -associated with device control functions.) Note that the choice of the -range here is somewhat arbitrary. You could just as easily index the -printing characters in ASCII using numbers in the range 0 through 93, 2 -through 95, 3 through 96, etc. In fact, the standardized _encoding_ -for the ASCII _character set_ uses the range 33 through 126. - - An "encoding" is a way of numerically representing characters from -one or more character sets into a stream of like-sized numerical values -called "words"; typically these are 8-bit, 16-bit, or 32-bit -quantities. If an encoding encompasses only one character set, then the -position codes for the characters in that character set could be used -directly. (This is the case with the trivial cipher used by children, -assigning 1 to `A', 2 to `B', and so on.) However, even with ASCII, -other considerations intrude. For example, why are the upper- and -lowercase alphabets separated by 8 characters? Why do the digits start -with `0' being assigned the code 48? In both cases because semantically -interesting operations (case conversion and numerical value extraction) -become convenient masking operations. Other artificial aspects (the -control characters being assigned to codes 0-31 and 127) are historical -accidents. (The use of 127 for `DEL' is an artifact of the "punch -once" nature of paper tape, for example.) - - Naive use of the position code is not possible, however, if more than -one character set is to be used in the encoding. For example, printed -Japanese text typically requires characters from multiple character sets -- ASCII, JIS X 0208, and JIS X 0212, to be specific. 
Each of these is -indexed using one or more position codes in the range 1 through 94, so -the position codes could not be used directly or there would be no way -to tell which character was meant. Different Japanese encodings handle -this differently - JIS uses special escape characters to denote -different character sets; EUC sets the high bit of the position codes -for JIS X 0208 and JIS X 0212, and puts a special extra byte before each -JIS X 0212 character; etc. (JIS, EUC, and most of the other encodings -you will encounter in files are 7-bit or 8-bit encodings. There is one -common 16-bit encoding, which is Unicode; this strives to represent all -the world's characters in a single large character set. 32-bit -encodings are often used internally in programs, such as XEmacs with -MULE support, to simplify the code that manipulates them; however, they -are not used externally because they are not very space-efficient.) - - A general method of handling text using multiple character sets -(whether for multilingual text, or simply text in an extremely -complicated single language like Japanese) is defined in the -international standard ISO 2022. ISO 2022 will be discussed in more -detail later (*note ISO 2022::), but for now suffice it to say that text -needs control functions (at least spacing), and if escape sequences are -to be used, an escape sequence introducer. It was decided to make all -text streams compatible with ASCII in the sense that the codes 0-31 -(and 128-159) would always be control codes, never graphic characters, -and where defined by the character set the `SPC' character would be -assigned code 32, and `DEL' would be assigned 127. Thus there are 94 -code points remaining if 7 bits are used. This is the reason that most -character sets are defined using position codes in the range 1 through -94. 
Then ISO 2022 compatible encodings are produced by shifting the -position codes 1 to 94 into character codes 33 to 126, or (if 8 bit -codes are available) into character codes 161 to 254. - - Encodings are classified as either "modal" or "non-modal". In a -"modal encoding", there are multiple states that the encoding can be -in, and the interpretation of the values in the stream depends on the -current global state of the encoding. Special values in the encoding, -called "escape sequences", are used to change the global state. JIS, -for example, is a modal encoding. The bytes `ESC $ B' indicate that, -from then on, bytes are to be interpreted as position codes for JIS X -0208, rather than as ASCII. This effect is cancelled using the bytes -`ESC ( B', which mean "switch from whatever the current state is to -ASCII". To switch to JIS X 0212, the escape sequence `ESC $ ( D'. -(Note that here, as is common, the escape sequences do in fact begin -with `ESC'. This is not necessarily the case, however. Some encodings -use control characters called "locking shifts" (effect persists until -cancelled) to switch character sets.) - - A "non-modal encoding" has no global state that extends past the -character currently being interpreted. EUC, for example, is a -non-modal encoding. Characters in JIS X 0208 are encoded by setting -the high bit of the position codes, and characters in JIS X 0212 are -encoded by doing the same but also prefixing the character with the -byte 0x8F. - - The advantage of a modal encoding is that it is generally more -space-efficient, and is easily extendable because there are essentially -an arbitrary number of escape sequences that can be created. The -disadvantage, however, is that it is much more difficult to work with -if it is not being processed in a sequential manner. 
In the non-modal -EUC encoding, for example, the byte 0x41 always refers to the letter -`A'; whereas in JIS, it could either be the letter `A', or one of the -two position codes in a JIS X 0208 character, or one of the two -position codes in a JIS X 0212 character. Determining exactly which -one is meant could be difficult and time-consuming if the previous -bytes in the string have not already been processed, or impossible if -they are drawn from an external stream that cannot be rewound. - - Non-modal encodings are further divided into "fixed-width" and -"variable-width" formats. A fixed-width encoding always uses the same -number of words per character, whereas a variable-width encoding does -not. EUC is a good example of a variable-width encoding: one to three -bytes are used per character, depending on the character set. 16-bit -and 32-bit encodings are nearly always fixed-width, and this is in fact -one of the main reasons for using an encoding with a larger word size. -The advantages of fixed-width encodings should be obvious. The -advantages of variable-width encodings are that they are generally more -space-efficient and allow for compatibility with existing 8-bit -encodings such as ASCII. (For example, in Unicode ASCII characters are -simply promoted to a 16-bit representation. That means that every -ASCII character contains a `NUL' byte; evidently all of the standard -string manipulation functions will lose badly in a fixed-width Unicode -environment.) - - The bytes in an 8-bit encoding are often referred to as "octets" -rather than simply as bytes. This terminology dates back to the days -before 8-bit bytes were universal, when some computers had 9-bit bytes, -others had 10-bit bytes, etc. + The following set of symbols are used to represent the intermediate +states involved in the asynchronous interface. 
- -File: lispref.info, Node: Charsets, Next: MULE Characters, Prev: Internationalization Terminology, Up: MULE + - Symbol: pgres::polling-failed + Undocumented. A fatal error has occurred during processing of an + asynchronous operation. -Charsets -======== + - Symbol: pgres::polling-reading + An intermediate status return during an asynchronous operation. It + indicates that one may use `select' before polling again. - A "charset" in MULE is an object that encapsulates a particular -character set as well as an ordering of those characters. Charsets are -permanent objects and are named using symbols, like faces. + - Symbol: pgres::polling-writing + An intermediate status return during an asynchronous operation. It + indicates that one may use `select' before polling again. - - Function: charsetp object - This function returns non-`nil' if OBJECT is a charset. + - Symbol: pgres::polling-ok + An asynchronous operation has successfully completed. -* Menu: + - Symbol: pgres::polling-active + An intermediate status return during an asynchronous operation. + One can call the poll function again immediately. -* Charset Properties:: Properties of a charset. -* Basic Charset Functions:: Functions for working with charsets. -* Charset Property Functions:: Functions for accessing charset properties. -* Predefined Charsets:: Predefined charset objects. + - Function: pq-pgconn conn field + CONN A database connection object. FIELD A symbol indicating + which field of PGconn to fetch. Possible values are shown in the + following table. 
+ `pq::db' + Database name - -File: lispref.info, Node: Charset Properties, Next: Basic Charset Functions, Up: Charsets + `pq::user' + Database user name -Charset Properties ------------------- + `pq::pass' + Database user's password + + `pq::host' + Hostname database server is running on + + `pq::port' + TCP port number used in the connection + + `pq::tty' + Debugging TTY + + Compatibility note: Debugging TTYs are not used in the + XEmacs Lisp API. + + `pq::options' + Additional server options + + `pq::status' + Connection status. Possible return values are shown in the + following table. + `pg::connection-ok' + The normal, connected status. + + `pg::connection-bad' + The connection is not open and the PGconn object needs + to be deleted by `pq-finish'. + + `pg::connection-started' + An asynchronous connection has been started, but is not + yet complete. + + `pg::connection-made' + An asynchronous connect has been made, and there is data + waiting to be sent. + + `pg::connection-awaiting-response' + Awaiting data from the backend during an asynchronous + connection. + + `pg::connection-auth-ok' + Received authentication, waiting for the backend to + start up. + + `pg::connection-setenv' + Negotiating environment during an asynchronous + connection. + + `pq::error-message' + The last error message that was delivered to this connection. - Charsets have the following properties: - -`name' - A symbol naming the charset. Every charset must have a different - name; this allows a charset to be referred to using its name - rather than the actual charset object. - -`doc-string' - A documentation string describing the charset. - -`registry' - A regular expression matching the font registry field for this - character set. For example, both the `ascii' and `latin-iso8859-1' - charsets use the registry `"ISO8859-1"'. This field is used to - choose an appropriate font when the user gives a general font - specification such as `-*-courier-medium-r-*-140-*', i.e. 
a - 14-point upright medium-weight Courier font. - -`dimension' - Number of position codes used to index a character in the - character set. XEmacs/MULE can only handle character sets of - dimension 1 or 2. This property defaults to 1. - -`chars' - Number of characters in each dimension. In XEmacs/MULE, the only - allowed values are 94 or 96. (There are a couple of pre-defined - character sets, such as ASCII, that do not follow this, but you - cannot define new ones like this.) Defaults to 94. Note that if - the dimension is 2, the character set thus described is 94x94 or - 96x96. - -`columns' - Number of columns used to display a character in this charset. - Only used in TTY mode. (Under X, the actual width of a character - can be derived from the font used to display the characters.) If - unspecified, defaults to the dimension. (This is almost always the - correct value, because character sets with dimension 2 are usually - ideograph character sets, which need two columns to display the - intricate ideographs.) - -`direction' - A symbol, either `l2r' (left-to-right) or `r2l' (right-to-left). - Defaults to `l2r'. This specifies the direction that the text - should be displayed in, and will be left-to-right for most - charsets but right-to-left for Hebrew and Arabic. (Right-to-left - display is not currently implemented.) - -`final' - Final byte of the standard ISO 2022 escape sequence designating - this charset. Must be supplied. Each combination of (DIMENSION, - CHARS) defines a separate namespace for final bytes, and each - charset within a particular namespace must have a different final - byte. Note that ISO 2022 restricts the final byte to the range - 0x30 - 0x7E if dimension == 1, and 0x30 - 0x5F if dimension == 2. - Note also that final bytes in the range 0x30 - 0x3F are reserved - for user-defined (not official) character sets. For more - information on ISO 2022, see *Note Coding Systems::. 
- -`graphic' - 0 (use left half of font on output) or 1 (use right half of font on - output). Defaults to 0. This specifies how to convert the - position codes that index a character in a character set into an - index into the font used to display the character set. With - `graphic' set to 0, position codes 33 through 126 map to font - indices 33 through 126; with it set to 1, position codes 33 - through 126 map to font indices 161 through 254 (i.e. the same - number but with the high bit set). For example, for a font whose - registry is ISO8859-1, the left half of the font (octets 0x20 - - 0x7F) is the `ascii' charset, while the right half (octets 0xA0 - - 0xFF) is the `latin-iso8859-1' charset. - -`ccl-program' - A compiled CCL program used to convert a character in this charset - into an index into the font. This is in addition to the `graphic' - property. If a CCL program is defined, the position codes of a - character will first be processed according to `graphic' and then - passed through the CCL program, with the resulting values used to - index the font. - - This is used, for example, in the Big5 character set (used in - Taiwan). This character set is not ISO-2022-compliant, and its - size (94x157) does not fit within the maximum 96x96 size of - ISO-2022-compliant character sets. As a result, XEmacs/MULE - splits it (in a rather complex fashion, so as to group the most - commonly used characters together) into two charset objects - (`big5-1' and `big5-2'), each of size 94x94, and each charset - object uses a CCL program to convert the modified position codes - back into standard Big5 indices to retrieve a character from a - Big5 font. - - Most of the above properties can only be set when the charset is -initialized, and cannot be changed later. *Note Charset Property -Functions::. + `pq::backend-pid' + The process ID of the backend database server. + + The `PGresult' object is used by libpq to encapsulate the results of +queries. 
The printed representation takes on four forms. When the +PGresult object contains tuples from an SQL `SELECT' it will look like: + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + + The number in brackets indicates how many rows of data are available. +When the PGresult object is the result of a command query that doesn't +return anything, it will look like: + + (pq-exec P "CREATE TABLE a_new_table (i int);") + => # + + When the query is a command-type query that can affect a +number of different rows but doesn't return any of them, it will look +like: + + (progn + (pq-exec P "INSERT INTO a_new_table VALUES (1);") + (pq-exec P "INSERT INTO a_new_table VALUES (2);") + (pq-exec P "INSERT INTO a_new_table VALUES (3);") + (setq R (pq-exec P "DELETE FROM a_new_table;"))) + => # + + Lastly, when the underlying PGresult object has been deallocated +directly by `pq-clear' the printed representation will look like: + + (progn + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + (pq-clear R) + R) + => # + + The following functions are accessors to various data in the +PGresult object. + + - Function: pq-result-status result + Return status of a query result. RESULT is a PGresult object. + The return value is one of the symbols in the following table. + `pgres::empty-query' + A query contained no text. This is usually the result of a + recoverable error, or a minor programming error. + + `pgres::command-ok' + A query command that doesn't return anything was executed + properly by the backend. + + `pgres::tuples-ok' + A query command that returns tuples was executed properly by + the backend. + + `pgres::copy-out' + Copy Out data transfer is in progress. + + `pgres::copy-in' + Copy In data transfer is in progress. + + `pgres::bad-response' + An unexpected response was received from the backend. + + `pgres::nonfatal-error' + Undocumented. This value is returned when the libpq function + `PQresultStatus' is called with a NULL pointer. 
+ + `pgres::fatal-error' + Undocumented. An error has occurred in processing the query + and the operation was not completed. + + - Function: pq-res-status result + Return the query result status as a string, not a symbol. RESULT + is a PGresult object. + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-res-status R) + => "PGRES_TUPLES_OK" + + - Function: pq-result-error-message result + Return an error message generated by the query, if any. RESULT is + a PGresult object. + + (setq R (pq-exec P "SELECT * FROM xemacs-test;")) + => + (pq-result-error-message R) + => "ERROR: parser: parse error at or near \"-\" + " + + - Function: pq-ntuples result + Return the number of tuples in the query result. RESULT is a + PGresult object. + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-ntuples R) + => 5 + + - Function: pq-nfields result + Return the number of fields in each tuple of the query result. + RESULT is a PGresult object. + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-nfields R) + => 3 + + - Function: pq-binary-tuples result + Returns t if binary tuples are present in the results, nil + otherwise. RESULT is a PGresult object. + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-binary-tuples R) + => nil + + - Function: pq-fname result field-index + Returns the name of a specific field. RESULT is a PGresult object. + FIELD-INDEX is the number of the column to select from. The first + column is number zero. + + (let (i l) + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + (setq i (pq-nfields R)) + (while (>= (decf i) 0) + (push (pq-fname R i) l)) + l) + => ("id" "shikona" "rank") + + - Function: pq-fnumber result field-name + Return the field number corresponding to the given field name. -1 + is returned on a bad field name. RESULT is a PGresult object. + FIELD-NAME is a string representing the field name to find. 
+ (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-fnumber R "id") + => 0 + (pq-fnumber R "Not a field") + => -1 + + - Function: pq-ftype result field-num + Return an integer code representing the data type of the specified + column. RESULT is a PGresult object. FIELD-NUM is the field + number. + + The return value of this function is the Object ID (Oid) of the + type in the database. Further queries need to be made to various + system tables in order to convert this value into something useful. + + - Function: pq-fmod result field-num + Return the type modifier code associated with a field. Field + numbers start at zero. RESULT is a PGresult object. FIELD-NUM + selects which field to use. + + - Function: pq-fsize result field-index + Return the size of the given field. RESULT is a PGresult object. + FIELD-INDEX selects which field to use. + + (let (i l) + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + (setq i (pq-nfields R)) + (while (>= (decf i) 0) + (push (list (pq-ftype R i) (pq-fsize R i)) l)) + l) + => ((23 23) (25 25) (25 25)) + + - Function: pq-get-value result tup-num field-num + Retrieve a return value. RESULT is a PGresult object. TUP-NUM + selects which tuple to fetch from. FIELD-NUM selects which field + to fetch from. + + Both tuples and fields are numbered from zero. + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-get-value R 0 1) + => "Musashimaru" + (pq-get-value R 1 1) + => "Dejima" + (pq-get-value R 2 1) + => "Musoyama" + + - Function: pq-get-length result tup-num field-num + Return the length of a specific value. RESULT is a PGresult + object. TUP-NUM selects which tuple to fetch from. FIELD-NUM + selects which field to fetch from. + + (setq R (pq-exec P "SELECT * FROM xemacs_test;")) + => # + (pq-get-length R 0 1) + => 11 + (pq-get-length R 1 1) + => 6 + (pq-get-length R 2 1) + => 8 + + - Function: pq-get-is-null result tup-num field-num + Return t if the specific value is the SQL NULL. 
RESULT is a + PGresult object. TUP-NUM selects which tuple to fetch from. + FIELD-NUM selects which field to fetch from. + + - Function: pq-cmd-status result + Return a summary string from the query. RESULT is a PGresult + object. + (setq R (pq-exec P "INSERT INTO xemacs_test + VALUES (6, 'Wakanohana', 'Yokozuna');")) + => # + (pq-cmd-status R) + => "INSERT 542086 1" + (setq R (pq-exec P "UPDATE xemacs_test SET rank='retired' + WHERE shikona='Wakanohana';")) + => # + (pq-cmd-status R) + => "UPDATE 1" + + Note that the first number returned from an insertion, as in the + example, is an object ID number and will almost certainly vary from + system to system since object ID numbers in Postgres must be unique + across all databases. + + - Function: pq-cmd-tuples result + Return the number of tuples affected if the last command was an + INSERT/UPDATE/DELETE. If the last command was something else, the + empty string is returned. RESULT is a PGresult object. + + (setq R (pq-exec P "INSERT INTO xemacs_test VALUES + (7, 'Takanohana', 'Yokozuna');")) + => # + (pq-cmd-tuples R) + => "1" + (setq R (pq-exec P "SELECT * from xemacs_test;")) + => # + (pq-cmd-tuples R) + => "" + (setq R (pq-exec P "DELETE FROM xemacs_test + WHERE shikona LIKE '%hana';")) + => # + (pq-cmd-tuples R) + => "2" + + - Function: pq-oid-value result + Return the object id of the insertion if the last command was an + INSERT. 0 is returned if the last command was not an insertion. + RESULT is a PGresult object. + + In the first example, the numbers you will see on your local + system will almost certainly be different; however, the second + number from the right in the unprintable PGresult object and the + number returned by `pq-oid-value' should match. 
+ (setq R (pq-exec P "INSERT INTO xemacs_test VALUES + (8, 'Terao', 'Maegashira');")) + => # + (pq-oid-value R) + => 542089 + (setq R (pq-exec P "SELECT shikona FROM xemacs_test + WHERE rank='Maegashira';")) + => # + (pq-oid-value R) + => 0 + + - Function: pq-make-empty-pgresult conn status + Create an empty pgresult with the given status. CONN a database + connection object STATUS a value that can be returned by + `pq-result-status'. + + The caller is responsible for making sure the return value gets + properly freed.  -File: lispref.info, Node: Basic Charset Functions, Next: Charset Property Functions, Prev: Charset Properties, Up: Charsets - -Basic Charset Functions ------------------------ - - - Function: find-charset charset-or-name - This function retrieves the charset of the given name. If - CHARSET-OR-NAME is a charset object, it is simply returned. - Otherwise, CHARSET-OR-NAME should be a symbol. If there is no - such charset, `nil' is returned. Otherwise the associated charset - object is returned. - - - Function: get-charset name - This function retrieves the charset of the given name. Same as - `find-charset' except an error is signalled if there is no such - charset instead of returning `nil'. - - - Function: charset-list - This function returns a list of the names of all defined charsets. - - - Function: make-charset name doc-string props - This function defines a new character set. This function is for - use with MULE support. NAME is a symbol, the name by which the - character set is normally referred. DOC-STRING is a string - describing the character set. PROPS is a property list, - describing the specific nature of the character set. The - recognized properties are `registry', `dimension', `columns', - `chars', `final', `graphic', `direction', and `ccl-program', as - previously described. - - - Function: make-reverse-direction-charset charset new-name - This function makes a charset equivalent to CHARSET but which goes - in the opposite direction. 
NEW-NAME is the name of the new - charset. The new charset is returned. - - - Function: charset-from-attributes dimension chars final &optional - direction - This function returns a charset with the given DIMENSION, CHARS, - FINAL, and DIRECTION. If DIRECTION is omitted, both directions - will be checked (left-to-right will be returned if character sets - exist for both directions). - - - Function: charset-reverse-direction-charset charset - This function returns the charset (if any) with the same dimension, - number of characters, and final byte as CHARSET, but which is - displayed in the opposite direction. +File: lispref.info, Node: Synchronous Interface Functions, Next: Asynchronous Interface Functions, Prev: libpq Lisp Symbols and DataTypes, Up: XEmacs PostgreSQL libpq API + +Synchronous Interface Functions +------------------------------- + + - Function: pq-connectdb conninfo + Establish a (synchronous) database connection. CONNINFO A string + of blank separated options. Options are of the form "OPTION = + VALUE". If VALUE contains blanks, it must be single quoted. + Blanks around the equal sign are optional. Multiple option + assignments are blank separated. + (pq-connectdb "dbname=japanese port = 25432") + => # + The printed representation of a database connection object has four + fields. The first field is the hostname where the database server + is running (in this case localhost), the second field is the port + number, the third field is the database user name, and the fourth + field is the name of the database. + + Database connection objects which have been disconnected and will + generate an immediate error if they are used look like: + # + Bad connections can be reestablished with `pq-reset', or deleted + entirely with `pq-finish'. 
+ + A database connection object that has been deleted looks like: + (let ((P1 (pq-connectdb ""))) + (pq-finish P1) + P1) + => # + + Note that database connection objects are the most heavyweight + objects in XEmacs Lisp at this writing, usually representing as + much as several megabytes of virtual memory on the machine the + database server is running on. It is wisest to explicitly delete + them when you are finished with them, rather than letting garbage + collection do it. An example idiom is: + + (let ((P (pq-connectdb ""))) + (unwind-protect + (progn + (...)) ; access database here + (pq-finish P))) + + The following options are available in the options string: + `authtype' + Authentication type. Same as PGAUTHTYPE. This is no longer + used. + + `user' + Database user name. Same as PGUSER. + + `password' + Database password. + + `dbname' + Database name. Same as PGDATABASE. + + `host' + Symbolic hostname. Same as PGHOST. + + `hostaddr' + Host address as four octets (e.g. 192.168.1.1). + + `port' + TCP port to connect to. Same as PGPORT. + + `tty' + Debugging TTY. Same as PGTTY. This value is suppressed in + the XEmacs Lisp API. + + `options' + Extra backend database options. Same as PGOPTIONS. A + database connection object is returned regardless of whether a + connection was established or not. + + - Function: pq-reset conn + Reestablish database connection. CONN A database connection + object. + + This function reestablishes a database connection using the + original connection parameters. This is useful if something has + happened to the TCP link and it has become broken. + + - Function: pq-exec conn query + Make a synchronous database query. CONN A database connection + object. QUERY A string containing an SQL query. A PGresult + object is returned, which in turn may be queried by its many + accessor functions to retrieve state out of it. If the query + string contains multiple SQL commands, only results from the final + command are returned.
+ + (setq R (pq-exec P "SELECT * FROM xemacs_test; + DELETE FROM xemacs_test WHERE id=8;")) + => # + + - Function: pq-notifies conn + Return the latest async notification that has not yet been handled. + CONN A database connection object. If there has been a + notification, then a list of two elements will be returned. The + first element contains the relation name being notified, the second + element contains the backend process ID number. nil is returned + if there aren't any notifications to process. + + - Function: PQsetenv conn + Synchronous transfer of environment variables to a backend CONN A + database connection object. + + Environment variable transfer is done as a normal part of database + connection. + + Compatibility note: This function was present but not documented + in versions of libpq prior to 7.0.  -File: lispref.info, Node: Charset Property Functions, Next: Predefined Charsets, Prev: Basic Charset Functions, Up: Charsets +File: lispref.info, Node: Asynchronous Interface Functions, Next: Large Object Support, Prev: Synchronous Interface Functions, Up: XEmacs PostgreSQL libpq API -Charset Property Functions --------------------------- +Asynchronous Interface Functions +-------------------------------- - All of these functions accept either a charset name or charset -object. + Making command by command examples is too complex with the +asynchronous interface functions. See the examples section for +complete calling sequences. - - Function: charset-property charset prop - This function returns property PROP of CHARSET. *Note Charset - Properties::. + - Function: pq-connect-start conninfo + Begin establishing an asynchronous database connection. CONNINFO + A string containing the connection options. See the documentation + of `pq-connectdb' for a listing of all the available flags. - Convenience functions are also provided for retrieving individual -properties of a charset. 
+ - Function: pq-connect-poll conn + An intermediate function to be called during an asynchronous + database connection. CONN A database connection object. The + result codes are documented in a previous section. - - Function: charset-name charset - This function returns the name of CHARSET. This will be a symbol. + - Function: pq-is-busy conn + Returns t if `pq-get-result' would block waiting for input. CONN + A database connection object. - - Function: charset-doc-string charset - This function returns the doc string of CHARSET. + - Function: pq-consume-input conn + Consume any available input from the backend. CONN A database + connection object. - - Function: charset-registry charset - This function returns the registry of CHARSET. + Nil is returned if anything bad happens. - - Function: charset-dimension charset - This function returns the dimension of CHARSET. + - Function: pq-reset-start conn + Reset connection to the backend asynchronously. CONN A database + connection object. - - Function: charset-chars charset - This function returns the number of characters per dimension of - CHARSET. + - Function: pq-reset-poll conn + Poll an asynchronous reset for completion CONN A database + connection object. - - Function: charset-columns charset - This function returns the number of display columns per character - (in TTY mode) of CHARSET. + - Function: pq-reset-cancel conn + Attempt to request cancellation of the current operation. CONN A + database connection object. - - Function: charset-direction charset - This function returns the display direction of CHARSET--either - `l2r' or `r2l'. + The return value is t if the cancel request was successfully + dispatched, nil if not (in which case conn->errorMessage is set). + Note: successful dispatch is no guarantee that there will be any + effect at the backend. The application must read the operation + result as usual. 
- - Function: charset-final charset - This function returns the final byte of the ISO 2022 escape - sequence designating CHARSET. + - Function: pq-send-query conn query + Submit a query to Postgres and don't wait for the result. CONN A + database connection object. Returns: t if successfully submitted + nil if error (conn->errorMessage is set) - - Function: charset-graphic charset - This function returns either 0 or 1, depending on whether the - position codes of characters in CHARSET map to the left or right - half of their font, respectively. + - Function: pq-get-result conn + Retrieve an asynchronous result from a query. CONN A database + connection object. - - Function: charset-ccl-program charset - This function returns the CCL program, if any, for converting - position codes of characters in CHARSET into font indices. + `nil' is returned when no more query work remains. - The only property of a charset that can currently be set after the -charset has been created is the CCL program. + - Function: pq-set-nonblocking conn arg + Sets the PGconn's database connection non-blocking if the arg is + TRUE or makes it blocking if the arg is FALSE. This will not + protect you from PQexec(); you'll only be safe when using the + non-blocking API. CONN A database connection object. - - Function: set-charset-ccl-program charset ccl-program - This function sets the `ccl-program' property of CHARSET to - CCL-PROGRAM. + - Function: pq-is-nonblocking conn + Return the blocking status of the database connection. CONN A + database connection object. - -File: lispref.info, Node: Predefined Charsets, Prev: Charset Property Functions, Up: Charsets - -Predefined Charsets -------------------- - - The following charsets are predefined in the C code.
- - Name Type Fi Gr Dir Registry - -------------------------------------------------------------- - ascii 94 B 0 l2r ISO8859-1 - control-1 94 0 l2r --- - latin-iso8859-1 94 A 1 l2r ISO8859-1 - latin-iso8859-2 96 B 1 l2r ISO8859-2 - latin-iso8859-3 96 C 1 l2r ISO8859-3 - latin-iso8859-4 96 D 1 l2r ISO8859-4 - cyrillic-iso8859-5 96 L 1 l2r ISO8859-5 - arabic-iso8859-6 96 G 1 r2l ISO8859-6 - greek-iso8859-7 96 F 1 l2r ISO8859-7 - hebrew-iso8859-8 96 H 1 r2l ISO8859-8 - latin-iso8859-9 96 M 1 l2r ISO8859-9 - thai-tis620 96 T 1 l2r TIS620 - katakana-jisx0201 94 I 1 l2r JISX0201.1976 - latin-jisx0201 94 J 0 l2r JISX0201.1976 - japanese-jisx0208-1978 94x94 @ 0 l2r JISX0208.1978 - japanese-jisx0208 94x94 B 0 l2r JISX0208.19(83|90) - japanese-jisx0212 94x94 D 0 l2r JISX0212 - chinese-gb2312 94x94 A 0 l2r GB2312 - chinese-cns11643-1 94x94 G 0 l2r CNS11643.1 - chinese-cns11643-2 94x94 H 0 l2r CNS11643.2 - chinese-big5-1 94x94 0 0 l2r Big5 - chinese-big5-2 94x94 1 0 l2r Big5 - korean-ksc5601 94x94 C 0 l2r KSC5601 - composite 96x96 0 l2r --- - - The following charsets are predefined in the Lisp code. - - Name Type Fi Gr Dir Registry - -------------------------------------------------------------- - arabic-digit 94 2 0 l2r MuleArabic-0 - arabic-1-column 94 3 0 r2l MuleArabic-1 - arabic-2-column 94 4 0 r2l MuleArabic-2 - sisheng 94 0 0 l2r sisheng_cwnn\|OMRON_UDC_ZH - chinese-cns11643-3 94x94 I 0 l2r CNS11643.1 - chinese-cns11643-4 94x94 J 0 l2r CNS11643.1 - chinese-cns11643-5 94x94 K 0 l2r CNS11643.1 - chinese-cns11643-6 94x94 L 0 l2r CNS11643.1 - chinese-cns11643-7 94x94 M 0 l2r CNS11643.1 - ethiopic 94x94 2 0 l2r Ethio - ascii-r2l 94 B 0 r2l ISO8859-1 - ipa 96 0 1 l2r MuleIPA - vietnamese-lower 96 1 1 l2r VISCII1.1 - vietnamese-upper 96 2 1 l2r VISCII1.1 - - For all of the above charsets, the dimension and number of columns -are the same. - - Note that ASCII, Control-1, and Composite are handled specially. 
-This is why some of the fields are blank; and some of the filled-in -fields (e.g. the type) are not really accurate. + - Function: pq-flush conn + Force the write buffer to be written (or at least try) CONN A + database connection object. + + - Function: PQsetenvStart conn + Start asynchronously passing environment variables to a backend. + CONN A database connection object. + + Compatibility note: this function is only available with libpq-7.0. + + - Function: PQsetenvPoll conn + Check an asynchronous environment variables transfer for + completion. CONN A database connection object. + + Compatibility note: this function is only available with libpq-7.0. + + - Function: PQsetenvAbort conn + Attempt to terminate an asynchronous environment variables + transfer. CONN A database connection object. + + Compatibility note: this function is only available with libpq-7.0.  -File: lispref.info, Node: MULE Characters, Next: Composite Characters, Prev: Charsets, Up: MULE +File: lispref.info, Node: Large Object Support, Next: Other libpq Functions, Prev: Asynchronous Interface Functions, Up: XEmacs PostgreSQL libpq API -MULE Characters -=============== +Large Object Support +-------------------- - - Function: make-char charset arg1 &optional arg2 - This function makes a multi-byte character from CHARSET and octets - ARG1 and ARG2. + - Function: pq-lo-import conn filename + Import a file as a large object into the database. CONN a + database connection object FILENAME filename to import - - Function: char-charset ch - This function returns the character set of char CH. + On success, the object id is returned. - - Function: char-octet ch &optional n - This function returns the octet (i.e. position code) numbered N - (should be 0 or 1) of char CH. N defaults to 0 if omitted. + - Function: pq-lo-export conn oid filename + Copy a large object in the database into a file. CONN a database + connection object. OID object id number of a large object. 
+ FILENAME filename to export to. - - Function: find-charset-region start end &optional buffer - This function returns a list of the charsets in the region between - START and END. BUFFER defaults to the current buffer if omitted. + +File: lispref.info, Node: Other libpq Functions, Next: Unimplemented libpq Functions, Prev: Large Object Support, Up: XEmacs PostgreSQL libpq API + +Other libpq Functions +--------------------- + + - Function: pq-finish conn + Destroy a database connection object by calling free on it. CONN + a database connection object + + It is possible to not call this routine because the usual XEmacs + garbage collection mechanism will call the underlying libpq + routine whenever it is releasing stale `PGconn' objects. However, + this routine is useful in `unwind-protect' clauses to make + connections go away quickly when unrecoverable errors have + occurred. + + After calling this routine, the printed representation of the + XEmacs wrapper object will contain the string "DEAD". + + - Function: pq-client-encoding conn + Return the client encoding as an integer code. CONN a database + connection object + + (pq-client-encoding P) + => 1 + + Compatibility note: This function did not exist prior to libpq-7.0 + and does not exist in a non-Mule XEmacs. + + - Function: pq-set-client-encoding conn encoding + Set client coding system. CONN a database connection object + ENCODING a string representing the desired coding system + + (pq-set-client-encoding P "EUC_JP") + => 0 + + The current idiom for ensuring proper coding system conversion is + the following (illustrated for EUC Japanese encoding): + (setq P (pq-connectdb "...")) + (let ((file-name-coding-system 'euc-jp) + (pg-coding-system 'euc-jp)) + (pq-set-client-encoding P "EUC_JP") + ...) + (pq-finish P) + Compatibility note: This function did not exist prior to libpq-7.0 + and does not exist in a non-Mule XEmacs.
+ + - Function: pq-env-2-encoding + Return the integer code representing the coding system in + PGCLIENTENCODING. + + (pq-env-2-encoding) + => 0 + Compatibility note: This function did not exist prior to libpq-7.0 + and does not exist in a non-Mule XEmacs. + + - Function: pq-clear res + Destroy a query result object by calling free() on it. RES a + query result object + + Note: The memory allocation systems of libpq and XEmacs are + different. The XEmacs representation of a query result object + will have both the XEmacs version and the libpq version freed at + the next garbage collection when the object is no longer being + referenced. Calling this function does not release the XEmacs + object, it is still subject to the usual rules for Lisp objects. + The printed representation of the XEmacs object will contain the + string "DEAD" after this routine is called indicating that it is no + longer useful for anything. + + - Function: pq-conn-defaults + Return a data structure that represents the connection defaults. + The data is returned as a list of lists, where each sublist + contains info regarding a single option. - - Function: find-charset-string string - This function returns a list of the charsets in STRING. + +File: lispref.info, Node: Unimplemented libpq Functions, Prev: Other libpq Functions, Up: XEmacs PostgreSQL libpq API + +Unimplemented libpq Functions +----------------------------- + + - Unimplemented Function: PGconn *PQsetdbLogin (char *pghost, char + *pgport, char *pgoptions, char *pgtty, char *dbName, char + *login, char *pwd) + Synchronous database connection. PGHOST is the hostname of the + PostgreSQL backend to connect to. PGPORT is the TCP port number + to use. PGOPTIONS specifies other backend options. PGTTY + specifies the debugging tty to use. DBNAME specifies the database + name to use. LOGIN specifies the database user name. PWD + specifies the database user's password. 
+ + This routine is deprecated as of libpq-7.0, and its functionality + can be replaced by external Lisp code if needed. + + - Unimplemented Function: PGconn *PQsetdb (char *pghost, char *pgport, + char *pgoptions, char *pgtty, char *dbName) + Synchronous database connection. PGHOST is the hostname of the + PostgreSQL backend to connect to. PGPORT is the TCP port number + to use. PGOPTIONS specifies other backend options. PGTTY + specifies the debugging tty to use. DBNAME specifies the database + name to use. + + This routine was deprecated in libpq-6.5. + + - Unimplemented Function: int PQsocket (PGconn *conn) + Return socket file descriptor to a backend database process. CONN + database connection object. + + - Unimplemented Function: void PQprint (FILE *fout, PGresult *res, + PGprintOpt *ps) + Print out the results of a query to a designated C stream. FOUT C + stream to print to RES the query result object to print PS the + print options structure. + + This routine is deprecated as of libpq-7.0 and cannot be sensibly + exported to XEmacs Lisp. + + - Unimplemented Function: void PQdisplayTuples (PGresult *res, FILE + *fp, int fillAlign, char *fieldSep, int printHeader, int + quiet) + RES query result object to print FP C stream to print to FILLALIGN + pad the fields with spaces FIELDSEP field separator PRINTHEADER + display headers? QUIET + + This routine was deprecated in libpq-6.5. + + - Unimplemented Function: void PQprintTuples (PGresult *res, FILE + *fout, int printAttName, int terseOutput, int width) + RES query result object to print FOUT C stream to print to + PRINTATTNAME print attribute names TERSEOUTPUT delimiter bars + WIDTH width of column, if 0, use variable width + + This routine was deprecated in libpq-6.5. + + - Unimplemented Function: int PQmblen (char *s, int encoding) + Determine length of a multibyte encoded char at `*s'. S encoded + string ENCODING type of encoding + + Compatibility note: This function was introduced in libpq-7.0. 
+ + - Unimplemented Function: void PQtrace (PGconn *conn, FILE *debug_port) + Enable tracing on `debug_port'. CONN database connection object. + DEBUG_PORT C output stream to use. + + - Unimplemented Function: void PQuntrace (PGconn *conn) + Disable tracing. CONN database connection object. + + - Unimplemented Function: char *PQoidStatus (PGconn *conn) + Return the object id as a string of the last tuple inserted. CONN + database connection object. + + Compatibility note: This function is deprecated in libpq-7.0, + however it is used internally by the XEmacs binding code when + linked against versions prior to 7.0. + + - Unimplemented Function: PGresult *PQfn (PGconn *conn, int fnid, int + *result_buf, int *result_len, int result_is_int, PQArgBlock + *args, int nargs) + "Fast path" interface -- not really recommended for application use + CONN A database connection object. FNID RESULT_BUF RESULT_LEN + RESULT_IS_INT ARGS NARGS + + The following set of very low level large object functions aren't +appropriate to be exported to Lisp. + + - Unimplemented Function: int pq-lo-open (PGconn *conn, int lobjid, + int mode) + CONN a database connection object. LOBJID a large object ID. + MODE opening modes. + + - Unimplemented Function: int pq-lo-close (PGconn *conn, int fd) + CONN a database connection object. FD a large object file + descriptor + + - Unimplemented Function: int pq-lo-read (PGconn *conn, int fd, char + *buf, int len) + CONN a database connection object. FD a large object file + descriptor. BUF buffer to read into. LEN size of buffer. + + - Unimplemented Function: int pq-lo-write (PGconn *conn, int fd, char + *buf, size_t len) + CONN a database connection object. FD a large object file + descriptor. BUF buffer to write from. LEN size of buffer. + + - Unimplemented Function: int pq-lo-lseek (PGconn *conn, int fd, int + offset, int whence) + CONN a database connection object. FD a large object file + descriptor. 
OFFSET WHENCE + + - Unimplemented Function: int pq-lo-creat (PGconn *conn, int mode) + CONN a database connection object. MODE opening modes. + + - Unimplemented Function: int pq-lo-tell (PGconn *conn, int fd) + CONN a database connection object. FD a large object file + descriptor. + + - Unimplemented Function: int pq-lo-unlink (PGconn *conn, int lobjid) + CONN a database connection object. LOBJID a large object ID.  -File: lispref.info, Node: Composite Characters, Next: Coding Systems, Prev: MULE Characters, Up: MULE +File: lispref.info, Node: XEmacs PostgreSQL libpq Examples, Prev: XEmacs PostgreSQL libpq API, Up: PostgreSQL Support -Composite Characters -==================== +XEmacs PostgreSQL libpq Examples +================================ - Composite characters are not yet completely implemented. + This is an example of one method of establishing an asynchronous +connection. + + (defun database-poller (P) + (message "%S before poll" (pq-pgconn P 'pq::status)) + (pq-connect-poll P) + (message "%S after poll" (pq-pgconn P 'pq::status)) + (if (eq (pq-pgconn P 'pq::status) 'pg::connection-ok) + (message "Done!") + (add-timeout .1 'database-poller P))) + => database-poller + (progn + (setq P (pq-connect-start "")) + (add-timeout .1 'database-poller P)) + => pg::connection-started before poll + => pg::connection-made after poll + => pg::connection-made before poll + => pg::connection-awaiting-response after poll + => pg::connection-awaiting-response before poll + => pg::connection-auth-ok after poll + => pg::connection-auth-ok before poll + => pg::connection-setenv after poll + => pg::connection-setenv before poll + => pg::connection-ok after poll + => Done! + P + => # + + Here is an example of one method of doing an asynchronous reset.
+ + (defun database-poller (P) + (let (PS) + (message "%S before poll" (pq-pgconn P 'pq::status)) + (setq PS (pq-reset-poll P)) + (message "%S after poll [%S]" (pq-pgconn P 'pq::status) PS) + (if (eq (pq-pgconn P 'pq::status) 'pg::connection-ok) + (message "Done!") + (add-timeout .1 'database-poller P)))) + => database-poller + (progn + (pq-reset-start P) + (add-timeout .1 'database-poller P)) + => pg::connection-started before poll + => pg::connection-made after poll [pgres::polling-writing] + => pg::connection-made before poll + => pg::connection-awaiting-response after poll [pgres::polling-reading] + => pg::connection-awaiting-response before poll + => pg::connection-setenv after poll [pgres::polling-reading] + => pg::connection-setenv before poll + => pg::connection-ok after poll [pgres::polling-ok] + => Done! + P + => # + + And finally, an asynchronous query. + + (defun database-poller (P) + (let (R) + (pq-consume-input P) + (if (pq-is-busy P) + (add-timeout .1 'database-poller P) + (setq R (pq-get-result P)) + (if R + (progn + (push R result-list) + (add-timeout .1 'database-poller P)))))) + => database-poller + (when (pq-send-query P "SELECT * FROM xemacs_test;") + (setq result-list nil) + (add-timeout .1 'database-poller P)) + => 885 + ;; wait a moment + result-list + => (#) + + Here is an example showing how multiple SQL statements in a single +query can have all their results collected. + ;; Using the same `database-poller' function from the previous example + (when (pq-send-query P "SELECT * FROM xemacs_test; + SELECT * FROM pg_database; + SELECT * FROM pg_user;") + (setq result-list nil) + (add-timeout .1 'database-poller P)) + => 1782 + ;; wait a moment + result-list + => (# # #) + + Here is an example which illustrates collecting all data from a +query, including the field names. + + (defun pg-util-query-results (results) + "Retrieve results of last SQL query into a list structure." 
+ (let ((i (1- (pq-ntuples results))) + j l1 l2) + (while (>= i 0) + (setq j (1- (pq-nfields results))) + (setq l2 nil) + (while (>= j 0) + (push (pq-get-value results i j) l2) + (decf j)) + (push l2 l1) + (decf i)) + (setq j (1- (pq-nfields results))) + (setq l2 nil) + (while (>= j 0) + (push (pq-fname results j) l2) + (decf j)) + (push l2 l1) + l1)) + => pg-util-query-results + (setq R (pq-exec P "SELECT * FROM xemacs_test ORDER BY field2 DESC;")) + => # + (pg-util-query-results R) + => (("f1" "field2") ("a" "97") ("b" "97") ("stuff" "42") ("a string" "12") ("foo" "10") ("string" "2") ("text" "1")) + + Here is an example of a query that uses a database cursor. + + (let (data R) + (setq R (pq-exec P "BEGIN;")) + (setq R (pq-exec P "DECLARE k_cursor CURSOR FOR SELECT * FROM xemacs_test ORDER BY f1 DESC;")) + + (setq R (pq-exec P "FETCH k_cursor;")) + (while (eq (pq-ntuples R) 1) + (push (list (pq-get-value R 0 0) (pq-get-value R 0 1)) data) + (setq R (pq-exec P "FETCH k_cursor;"))) + (setq R (pq-exec P "END;")) + data) + => (("a" "97") ("a string" "12") ("b" "97") ("foo" "10") ("string" "2") ("stuff" "42") ("text" "1")) + + Here's another example of cursors, this time with a Lisp macro to +implement a mapping function over a table.
+ + (defmacro map-db (P table condition callout) + `(let (R) + (pq-exec ,P "BEGIN;") + (pq-exec ,P (concat "DECLARE k_cursor CURSOR FOR SELECT * FROM " + ,table + " " + ,condition + " ORDER BY f1 DESC;")) + (setq R (pq-exec ,P "FETCH k_cursor;")) + (while (eq (pq-ntuples R) 1) + (,callout (pq-get-value R 0 0) (pq-get-value R 0 1)) + (setq R (pq-exec ,P "FETCH k_cursor;"))) + (pq-exec ,P "END;"))) + => map-db + (defun callback (arg1 arg2) + (message "arg1 = %s, arg2 = %s" arg1 arg2)) + => callback + (map-db P "xemacs_test" "WHERE field2 > 10" callback) + => arg1 = stuff, arg2 = 42 + => arg1 = b, arg2 = 97 + => arg1 = a string, arg2 = 12 + => arg1 = a, arg2 = 97 + => # + +File: lispref.info, Node: Internationalization, Next: MULE, Prev: PostgreSQL Support, Up: Top - - Function: make-composite-char string - This function converts a string into a single composite character. - The character is the result of overstriking all the characters in - the string. +Internationalization +******************** - - Function: composite-char-string ch - This function returns a string of the characters comprising a - composite character. +* Menu: - - Function: compose-region start end &optional buffer - This function composes the characters in the region from START to - END in BUFFER into one composite character. The composite - character replaces the composed characters. BUFFER defaults to - the current buffer if omitted. +* I18N Levels 1 and 2:: Support for different time, date, and currency formats. +* I18N Level 3:: Support for localized messages. +* I18N Level 4:: Support for Asian languages. 
-File: lispref.info, Node: Coding Systems, Next: CCL, Prev: Composite Characters, Up: MULE +File: lispref.info, Node: I18N Levels 1 and 2, Next: I18N Level 3, Up: Internationalization -Coding Systems -============== +I18N Levels 1 and 2 +=================== - A coding system is an object that defines how text containing -multiple character sets is encoded into a stream of (typically 8-bit) -bytes. The coding system is used to decode the stream into a series of -characters (which may be from multiple charsets) when the text is read -from a file or process, and is used to encode the text back into the -same format when it is written out to a file or process. + XEmacs is now compliant with I18N levels 1 and 2. Specifically, +this means that it is 8-bit clean and correctly handles time and date +functions. XEmacs will correctly display the entire ISO-Latin 1 +character set. - For example, many ISO-2022-compliant coding systems (such as Compound -Text, which is used for inter-client data under the X Window System) use -escape sequences to switch between different charsets - Japanese Kanji, -for example, is invoked with `ESC $ ( B'; ASCII is invoked with `ESC ( -B'; and Cyrillic is invoked with `ESC - L'. See `make-coding-system' -for more information. + The compose key may now be used to create any character in the +ISO-Latin 1 character set not directly available via the keyboard. In +order for the compose key to work it is necessary to load the file +`x-compose.el'. At any time while composing a character, `C-h' will +display all valid completions and the character which would be produced. - Coding systems are normally identified using a symbol, and the -symbol is accepted in place of the actual coding system object whenever -a coding system is called for. (This is similar to how faces and -charsets work.)
+ +File: lispref.info, Node: I18N Level 3, Next: I18N Level 4, Prev: I18N Levels 1 and 2, Up: Internationalization - - Function: coding-system-p object - This function returns non-`nil' if OBJECT is a coding system. +I18N Level 3 +============ * Menu: -* Coding System Types:: Classifying coding systems. -* ISO 2022:: An international standard for - charsets and encodings. -* EOL Conversion:: Dealing with different ways of denoting - the end of a line. -* Coding System Properties:: Properties of a coding system. -* Basic Coding System Functions:: Working with coding systems. -* Coding System Property Functions:: Retrieving a coding system's properties. -* Encoding and Decoding Text:: Encoding and decoding text. -* Detection of Textual Encoding:: Determining how text is encoded. -* Big5 and Shift-JIS Functions:: Special functions for these non-standard - encodings. -* Predefined Coding Systems:: Coding systems implemented by MULE. +* Level 3 Basics:: +* Level 3 Primitives:: +* Dynamic Messaging:: +* Domain Specification:: +* Documentation String Extraction::  -File: lispref.info, Node: Coding System Types, Next: ISO 2022, Up: Coding Systems - -Coding System Types -------------------- - - The coding system type determines the basic algorithm XEmacs will -use to decode or encode a data stream. Character encodings will be -converted to the MULE encoding, escape sequences processed, and newline -sequences converted to XEmacs's internal representation. There are -three basic classes of coding system type: no-conversion, ISO-2022, and -special. - - No conversion allows you to look at the file's internal -representation. Since XEmacs is basically a text editor, "no -conversion" does convert newline conventions by default. (Use the -'binary coding-system if this is not desired.) - - ISO 2022 (*note ISO 2022::) is the basic international standard -regulating use of "coded character sets for the exchange of data", ie, -text streams. 
ISO 2022 contains functions that make it possible to -encode text streams to comply with restrictions of the Internet mail -system and de facto restrictions of most file systems (eg, use of the -separator character in file names). Coding systems which are not ISO -2022 conformant can be difficult to handle. Perhaps more important, -they are not adaptable to multilingual information interchange, with -the obvious exception of ISO 10646 (Unicode). (Unicode is partially -supported by XEmacs with the addition of the Lisp package ucs-conv.) - - The special class of coding systems includes automatic detection, -CCL (a "little language" embedded as an interpreter, useful for -translating between variants of a single character set), -non-ISO-2022-conformant encodings like Unicode, Shift JIS, and Big5, -and MULE internal coding. (NB: this list is based on XEmacs 21.2. -Terminology may vary slightly for other versions of XEmacs and for GNU -Emacs 20.) - -`no-conversion' - No conversion, for binary files, and a few special cases of - non-ISO-2022 coding systems where conversion is done by hook - functions (usually implemented in CCL). On output, graphic - characters that are not in ASCII or Latin-1 will be replaced by a - `?'. (For a no-conversion-encoded buffer, these characters will - only be present if you explicitly insert them.) - -`iso2022' - Any ISO-2022-compliant encoding. Among others, this includes JIS - (the Japanese encoding commonly used for e-mail), national - variants of EUC (the standard Unix encoding for Japanese and other - languages), and Compound Text (an encoding used in X11). You can - specify more specific information about the conversion with the - FLAGS argument. - -`ucs-4' - ISO 10646 UCS-4 encoding. A 31-bit fixed-width superset of - Unicode. - -`utf-8' - ISO 10646 UTF-8 encoding. A "file system safe" transformation - format that can be used with both UCS-4 and Unicode. - -`undecided' - Automatic conversion. 
XEmacs attempts to detect the coding system - used in the file. - -`shift-jis' - Shift-JIS (a Japanese encoding commonly used in PC operating - systems). - -`big5' - Big5 (the encoding commonly used for Taiwanese). - -`ccl' - The conversion is performed using a user-written pseudo-code - program. CCL (Code Conversion Language) is the name of this - pseudo-code. For example, CCL is used to map KOI8-R characters - (an encoding for Russian Cyrillic) to ISO8859-5 (the form used - internally by MULE). - -`internal' - Write out or read in the raw contents of the memory representing - the buffer's text. This is primarily useful for debugging - purposes, and is only enabled when XEmacs has been compiled with - `DEBUG_XEMACS' set (the `--debug' configure option). *Warning*: - Reading in a file using `internal' conversion can result in an - internal inconsistency in the memory representing a buffer's text, - which will produce unpredictable results and may cause XEmacs to - crash. Under normal circumstances you should never use `internal' - conversion. +File: lispref.info, Node: Level 3 Basics, Next: Level 3 Primitives, Up: I18N Level 3 + +Level 3 Basics +-------------- + + XEmacs now provides alpha-level functionality for I18N Level 3. +This means that everything necessary for full messaging is available, +but not every file has been converted. + + The two message files which have been created are `src/emacs.po' and +`lisp/packages/mh-e.po'. Both files need to be converted using +`msgfmt', and the resulting `.mo' files placed in some locale's +`LC_MESSAGES' directory. The test "translations" in these files are +the original messages prefixed by `TRNSLT_'. + + The domain for a variable is stored on the variable's property list +under the property name VARIABLE-DOMAIN. The function +`documentation-property' uses this information when translating a +variable's documentation.  
-File: lispref.info, Node: ISO 2022, Next: EOL Conversion, Prev: Coding System Types, Up: Coding Systems - -ISO 2022 -======== - - This section briefly describes the ISO 2022 encoding standard. A -more thorough treatment is available in the original document of ISO -2022 as well as various national standards (such as JIS X 0202). - - Character sets ("charsets") are classified into the following four -categories, according to the number of characters in the charset: -94-charset, 96-charset, 94x94-charset, and 96x96-charset. This means -that although an ISO 2022 coding system may have variable width -characters, each charset used is fixed-width (in contrast to the MULE -character set and UTF-8, for example). - - ISO 2022 provides for switching between character sets via escape -sequences. This switching is somewhat complicated, because ISO 2022 -provides for both legacy applications like Internet mail that accept -only 7 significant bits in some contexts (RFC 822 headers, for example), -and more modern "8-bit clean" applications. It also provides for -compact and transparent representation of languages like Japanese which -mix ASCII and a national script (even outside of computer programs). - - First, ISO 2022 codified prevailing practice by dividing the code -space into "control" and "graphic" regions. The code points 0x00-0x1F -and 0x80-0x9F are reserved for "control characters", while "graphic -characters" must be assigned to code points in the regions 0x20-0x7F and -0xA0-0xFF. The positions 0x20 and 0x7F are special, and under some -circumstances must be assigned the graphic character "ASCII SPACE" and -the control character "ASCII DEL" respectively. - - The various regions are given the name C0 (0x00-0x1F), GL -(0x20-0x7F), C1 (0x80-0x9F), and GR (0xA0-0xFF). 
GL and GR stand for -"graphic left" and "graphic right", respectively, because of the -standard method of displaying graphic character sets in tables with the -high byte indexing columns and the low byte indexing rows. I don't -find it very intuitive, but these are called "registers". - - An ISO 2022-conformant encoding for a graphic character set must use -a fixed number of bytes per character, and the values must fit into a -single register; that is, each byte must range over either 0x20-0x7F, or -0xA0-0xFF. It is not allowed to extend the range of the repertoire of a -character set by using both ranges at the same time. This is why a standard -character set such as ISO 8859-1 is actually considered by ISO 2022 to -be an aggregation of two character sets, ASCII and LATIN-1, and why it -is technically incorrect to refer to ISO 8859-1 as "Latin 1". Also, a -single character's bytes must all be drawn from the same register; this -is why Shift JIS (for Japanese) and Big 5 (for Chinese) are not ISO -2022-compatible encodings. - - The reason for this restriction becomes clear when you attempt to -define an efficient, robust encoding for a language like Japanese. -Like ISO 8859, Japanese encodings are aggregations of several character -sets. In practice, the vast majority of characters are drawn from the -"JIS Roman" character set (a derivative of ASCII; it won't hurt to -think of it as ASCII) and the JIS X 0208 standard "basic Japanese" -character set including not only ideographic characters ("kanji") but -syllabic Japanese characters ("kana"), a wide variety of symbols, and -many alphabetic characters (Roman, Greek, and Cyrillic) as well. -Although JIS X 0208 includes the whole Roman alphabet, as a 2-byte code -it is not suited to programming; thus the inclusion of ASCII in the -standard Japanese encodings. - - For normal Japanese text such as in newspapers, a broad repertoire of -approximately 3000 characters is used.
Evidently this won't fit into -one byte; two must be used. But much of the text processed by Japanese -computers is computer source code, nearly all of which is ASCII. A not -insignificant portion of ordinary text is English (as such or as -borrowed Japanese vocabulary) or other languages which can be represented -at least approximately in ASCII, as well. It seems reasonable then to -represent ASCII in one byte, and JIS X 0208 in two. And this is exactly -what the Extended Unix Code for Japanese (EUC-JP) does. ASCII is -invoked to the GL register, and JIS X 0208 is invoked to the GR -register. Thus, each byte can be tested for its character set by -looking at the high bit; if set, it is Japanese, if clear, it is ASCII. -Furthermore, since control characters like newline can never be part of -a graphic character, even in the case of corruption in transmission the -stream will be resynchronized at every line break, on the order of 60-80 -bytes. This coding system requires no escape sequences or special -control codes to represent 99.9% of all Japanese text. - - Note carefully the distinction between the character sets (ASCII and -JIS X 0208), the encoding (EUC-JP), and the coding system (ISO 2022). -The JIS X 0208 character set is used in three different encodings for -Japanese, but in ISO-2022-JP it is invoked into GL (so the high bit is -always clear), in EUC-JP it is invoked into GR (setting the high bit in -the process), and in Shift JIS the high bit may be set or reset, and the -significant bits are shifted within the 16-bit character so that the two -main character sets can coexist with a third (the "halfwidth katakana" -of JIS X 0201). As the name implies, the ISO-2022-JP encoding is also a -version of the ISO-2022 coding system. - - In order to systematically treat subsidiary character sets (like the -"halfwidth katakana" already mentioned, and the "supplementary kanji" of -JIS X 0212), four further registers are defined: G0, G1, G2, and G3.
-Unlike GL and GR, they are not logically distinguished by internal -format. Instead, the process of "invocation" mentioned earlier is -broken into two steps: first, a character set is "designated" to one of -the registers G0-G3 by use of an "escape sequence" of the form: - - ESC [I] I F - - where I is an intermediate character or characters in the range 0x20 -- 0x3F, and F, from the range 0x30-0x7F, is the final character -identifying this charset. (Final characters in the range 0x30-0x3F are -reserved for private use and will never have a publicly registered -meaning.) - - Then that register is "invoked" to either GL or GR, either -automatically (designations to G0 normally involve invocation to GL as -well), or by use of shifting (affecting only the following character in -the data stream) or locking (effective until the next designation or -locking) control sequences. An encoding conformant to ISO 2022 is -typically defined by designating the initial contents of the G0-G3 -registers, specifying a 7- or 8-bit environment, and specifying whether -further designations will be recognized. - - Some examples of character sets and the registered final characters -F used to designate them: - -94-charset - ASCII (B), left (J) and right (I) half of JIS X 0201, ... - -96-charset - Latin-1 (A), Latin-2 (B), Latin-3 (C), ... - -94x94-charset - GB2312 (A), JIS X 0208 (B), KSC5601 (C), ... - -96x96-charset - none for the moment - - The meanings of the various characters in these sequences, where not -specified by the ISO 2022 standard (such as the ESC character), are -assigned by "ECMA", the European Computer Manufacturers Association. - - The meanings of the intermediate characters are: - - $ [0x24]: indicate charset of dimension 2 (94x94 or 96x96). - ( [0x28]: designate to G0 a 94-charset whose final byte is F. - ) [0x29]: designate to G1 a 94-charset whose final byte is F. - * [0x2A]: designate to G2 a 94-charset whose final byte is F.
- + [0x2B]: designate to G3 a 94-charset whose final byte is F. - , [0x2C]: designate to G0 a 96-charset whose final byte is F. - - [0x2D]: designate to G1 a 96-charset whose final byte is F. - . [0x2E]: designate to G2 a 96-charset whose final byte is F. - / [0x2F]: designate to G3 a 96-charset whose final byte is F. - - The comma may be used in files read and written only by MULE, as a -MULE extension, but this is illegal in ISO 2022. (The reason is that -in ISO 2022 G0 must be a 94-member character set, with 0x20 assigned -the value SPACE, and 0x7F assigned the value DEL.) - - Here are examples of designations: - - ESC ( B : designate to G0 ASCII - ESC - A : designate to G1 Latin-1 - ESC $ ( A or ESC $ A : designate to G0 GB2312 - ESC $ ( B or ESC $ B : designate to G0 JISX0208 - ESC $ ) C : designate to G1 KSC5601 - - (The short forms used to designate GB2312 and JIS X 0208 are for -backwards compatibility; the long forms are preferred.) - - To use a charset designated to G2 or G3, and to use a charset -designated to G1 in a 7-bit environment, you must explicitly invoke G1, -G2, or G3 into GL. There are two types of invocation, Locking Shift -(forever) and Single Shift (one character only). - - Locking Shift is done as follows: - - LS0 or SI (0x0F): invoke G0 into GL - LS1 or SO (0x0E): invoke G1 into GL - LS2: invoke G2 into GL - LS3: invoke G3 into GL - LS1R: invoke G1 into GR - LS2R: invoke G2 into GR - LS3R: invoke G3 into GR - - Single Shift is done as follows: - - SS2 or ESC N: invoke G2 into GL - SS3 or ESC O: invoke G3 into GL - - The shift functions (such as LS1R and SS3) are represented by control -characters (from C1) in 8 bit environments and by escape sequences in 7 -bit environments. +File: lispref.info, Node: Level 3 Primitives, Next: Dynamic Messaging, Prev: Level 3 Basics, Up: I18N Level 3 + +Level 3 Primitives +------------------ - (#### Ben says: I think the above is slightly incorrect. 
It appears -that SS2 invokes G2 into GR and SS3 invokes G3 into GR, whereas ESC N -and ESC O behave as indicated. The above definitions will not parse -EUC-encoded text correctly, and it looks like the code in mule-coding.c -has similar problems.) + - Function: gettext string + This function looks up STRING in the default message domain and + returns its translation. If `I18N3' was not enabled when XEmacs + was compiled, it just returns STRING. - Evidently there are a lot of ISO-2022-compliant ways of encoding -multilingual text. Now, in the world, there exist many coding systems -such as X11's Compound Text, Japanese JUNET code, and so-called EUC -(Extended UNIX Code); all of these are variants of ISO 2022. + - Function: dgettext domain string + This function looks up STRING in the specified message domain and + returns its translation. If `I18N3' was not enabled when XEmacs + was compiled, it just returns STRING. - In MULE, we characterize a version of ISO 2022 by the following -attributes: + - Function: bind-text-domain domain pathname + This function associates a pathname with a message domain. Here's + how the path to the message file is constructed under SunOS 5.x: - 1. The character sets initially designated to G0 thru G3. + `{pathname}/{LANG}/LC_MESSAGES/{domain}.mo' - 2. Whether short form designations are allowed for Japanese and - Chinese. + If `I18N3' was not enabled when XEmacs was compiled, this function + does nothing. - 3. Whether ASCII should be designated to G0 before control characters. + - Special Form: domain string + This function specifies the text domain used for translating + documentation strings and interactive prompts of a function. For + example, write: - 4. Whether ASCII should be designated to G0 at the end of line. + (defun foo (arg) "Doc string" (domain "emacs-foo") ...) - 5. 7-bit environment or 8-bit environment. + to specify `emacs-foo' as the text domain of the function `foo'.
+ The "call" to `domain' is actually a declaration rather than a + function; when actually called, `domain' just returns `nil'. - 6. Whether Locking Shifts are used or not. + - Function: domain-of function + This function returns the text domain of FUNCTION; it returns + `nil' if it is the default domain. If `I18N3' was not enabled + when XEmacs was compiled, it always returns `nil'. - 7. Whether to use ASCII or the variant JIS X 0201-1976-Roman. + +File: lispref.info, Node: Dynamic Messaging, Next: Domain Specification, Prev: Level 3 Primitives, Up: I18N Level 3 - 8. Whether to use JIS X 0208-1983 or the older version JIS X - 0208-1976. +Dynamic Messaging +----------------- - (The last two are only for Japanese.) + The `format' function has been extended to permit you to change the +order of parameter insertion. For example, the conversion format +`%1$s' inserts parameter one as a string, while `%2$s' inserts +parameter two. This is useful when creating translations which require +you to change the word order. - By specifying these attributes, you can create any variant of ISO -2022. + +File: lispref.info, Node: Domain Specification, Next: Documentation String Extraction, Prev: Dynamic Messaging, Up: I18N Level 3 - Here are several examples: +Domain Specification +-------------------- - ISO-2022-JP -- Coding system used in Japanese email (RFC 1463 #### check). - 1. G0 <- ASCII, G1..3 <- never used - 2. Yes. - 3. Yes. - 4. Yes. - 5. 7-bit environment - 6. No. - 7. Use ASCII - 8. Use JIS X 0208-1983 - - ctext -- X11 Compound Text - 1. G0 <- ASCII, G1 <- Latin-1, G2,3 <- never used. - 2. No. - 3. No. - 4. Yes. - 5. 8-bit environment. - 6. No. - 7. Use ASCII. - 8. Use JIS X 0208-1983. - - euc-china -- Chinese EUC. Often called the "GB encoding", but that is - technically incorrect. - 1. G0 <- ASCII, G1 <- GB 2312, G2,3 <- never used. - 2. No. - 3. Yes. - 4. Yes. - 5. 8-bit environment. - 6. No. - 7. Use ASCII. - 8. Use JIS X 0208-1983. 
- - ISO-2022-KR -- Coding system used in Korean email. - 1. G0 <- ASCII, G1 <- KSC 5601, G2,3 <- never used. - 2. No. - 3. Yes. - 4. Yes. - 5. 7-bit environment. - 6. Yes. - 7. Use ASCII. - 8. Use JIS X 0208-1983. - - MULE creates all of these coding systems by default. + The default message domain of XEmacs is `emacs'. For add-on +packages, it is best to use a different domain. For example, let us +say we want to convert the "gorilla" package to use the domain +`emacs-gorilla'. To translate the message "What gorilla?", use +`dgettext' as follows: + + (dgettext "emacs-gorilla" "What gorilla?") + + A function (or macro) which has a documentation string or an +interactive prompt needs to be associated with the domain in order for +the documentation or prompt to be translated. This is done with the +`domain' special form as follows: + + (defun scratch (location) + "Scratch the specified location." + (domain "emacs-gorilla") + (interactive "sScratch: ") + ... ) + + It is most efficient to specify the domain in the first line of the +function body, before the `interactive' form. + + For variables and constants which have documentation strings, +specify the domain after the documentation. + + - Special Form: defvar symbol [value [doc-string [domain]]] + Example: + (defvar weight 250 "Weight of gorilla, in pounds." "emacs-gorilla") + + - Special Form: defconst symbol [value [doc-string [domain]]] + Example: + (defconst limbs 4 "Number of limbs" "emacs-gorilla") + + - Function: autoload function filename &optional docstring interactive + type + This function defines FUNCTION to autoload from FILENAME. Example: + (autoload 'explore "jungle" "Explore the jungle."
nil nil "emacs-gorilla")  -File: lispref.info, Node: EOL Conversion, Next: Coding System Properties, Prev: ISO 2022, Up: Coding Systems +File: lispref.info, Node: Documentation String Extraction, Prev: Domain Specification, Up: I18N Level 3 -EOL Conversion --------------- +Documentation String Extraction +------------------------------- + + The utility `etc/make-po' scans the file `DOC' to extract +documentation strings and creates a message file `doc.po'. This file +may then be inserted within `emacs.po'. + + Currently, `make-po' is hard-coded to read from `DOC' and write to +`doc.po'. In order to extract documentation strings from an add-on +package, first run `make-docfile' on the package to produce the `DOC' +file. Then run `make-po -p' with the `-p' argument to indicate that we +are extracting documentation for an add-on package. + + (The `-p' argument is a kludge to make up for a subtle difference +between pre-loaded documentation and add-on documentation: For add-on +packages, the final carriage returns in the strings produced by +`make-docfile' must be ignored.) + + +File: lispref.info, Node: I18N Level 4, Prev: I18N Level 3, Up: Internationalization + +I18N Level 4 +============ + + The Asian-language support in XEmacs is called "MULE". *Note MULE::. + + +File: lispref.info, Node: MULE, Next: Tips, Prev: Internationalization, Up: Top + +MULE +**** + + "MULE" is the name originally given to the version of GNU Emacs +extended for multi-lingual (and in particular Asian-language) support. +"MULE" is short for "MUlti-Lingual Emacs". It is an extension and +complete rewrite of Nemacs ("Nihon Emacs" where "Nihon" is the Japanese +word for "Japan"), which only provided support for Japanese. XEmacs +refers to its multi-lingual support as "MULE support" since it is based +on "MULE". + +* Menu: -`nil' - Automatically detect the end-of-line type (LF, CRLF, or CR). 
Also - generate subsidiary coding systems named `NAME-unix', `NAME-dos', - and `NAME-mac', that are identical to this coding system but have - an EOL-TYPE value of `lf', `crlf', and `cr', respectively. - -`lf' - The end of a line is marked externally using ASCII LF. Since this - is also the way that XEmacs represents an end-of-line internally, - specifying this option results in no end-of-line conversion. This - is the standard format for Unix text files. - -`crlf' - The end of a line is marked externally using ASCII CRLF. This is - the standard format for MS-DOS text files. - -`cr' - The end of a line is marked externally using ASCII CR. This is the - standard format for Macintosh text files. - -`t' - Automatically detect the end-of-line type but do not generate - subsidiary coding systems. (This value is converted to `nil' when - stored internally, and `coding-system-property' will return `nil'.) +* Internationalization Terminology:: + Definition of various internationalization terms. +* Charsets:: Sets of related characters. +* MULE Characters:: Working with characters in XEmacs/MULE. +* Composite Characters:: Making new characters by overstriking other ones. +* Coding Systems:: Ways of representing a string of chars using integers. +* CCL:: A special language for writing fast converters. +* Category Tables:: Subdividing charsets into groups.