/* Copyright (C) 2003, 2004
National Institute of Advanced Industrial Science and Technology (AIST)
Registration Number H15PRO112
See the end for copying conditions. */
/***en
@page mdbGeneral General Format
@section general-description DESCRIPTION
The mdatabase_load () function returns the data specified by tags in
the form of a plist if the first tag is neither @c Mchartable nor
@c Mcharset. The keys of the returned plist are limited to
Minteger, Msymbol, Mtext, and
Mplist. The type of the value is unambiguously determined by
the corresponding key. If the key is Minteger, the value is
an integer. If the key is Msymbol, the value is a symbol.
Likewise, an Mtext key has an M-text value, and an Mplist key has a
plist value.
A plist can be represented by a number of different expressions. For
instance, we can use the form (K1:V1, K2:V2, ..., Kn:Vn) to
represent a plist whose first property key and value are K1 and V1,
second key and value are K2 and V2, and so on. However, we can use a
simpler expression here because the types of plists used in the m17n
database are fairly restricted.
Hereafter, we use an expression similar to an S-expression to
represent a plist. (Actually, the default database loader of the m17n
library is designed to read data files written in this expression.)
The expression consists of one or more elements. Each element
represents a property, i.e. a single element of a plist.
Elements are separated by one or more whitespace characters, i.e. a space
(code 32), a tab (code 9), or a newline (code 10). Comments begin
with a semicolon (;) and extend to the end of the line.
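As a rough illustration of the separator rules above, the following C
sketch advances past whitespace and comments to the start of the next
element. The helper name skip_separators is hypothetical and is not
part of the m17n library; the actual loader may be organized
differently.

```c
// Hypothetical helper for illustration only (not an m17n function).
// Returns a pointer to the first character of the next element, or to
// the terminating NUL if only separators remain.
static const char *
skip_separators (const char *p)
{
  for (;;)
    {
      if (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
      else if (*p == ';')
        {
          // A comment extends to the end of the line; the newline
          // itself is consumed as whitespace on the next iteration.
          while (*p != '\0' && *p != '\n')
            p++;
        }
      else
        return p;
    }
}
```

The sketch is meant to be called only between elements, since a
semicolon inside double quotes would belong to an M-text rather than
start a comment.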
The key and the value of each property are determined based on the
type of the element as explained below.
- INTEGER
An element that matches the regular expression -?[0-9]+ or
0[xX][0-9A-Fa-f]+ represents a property whose key is
Minteger. An element matching the former expression is
interpreted as an integer in decimal notation, and one matching the
latter is interpreted as an integer in hexadecimal notation. The
value of the property is the result of interpretation.
For instance, the element 0xA0 represents a property whose
value is 160 in decimal.
- SYMBOL
An element that matches the regular expression
[^-0-9(]([^\\()]|\\.)+ represents a property whose key is
Msymbol. In the element, \\t, \\n,
\\r, and \\e are replaced with tab (code 9), newline
(code 10), carriage return (code 13), and escape (code 27)
respectively. Any other character following a backslash is interpreted
as is. The value of the property is the symbol having the
resulting string as its name.
For instance, the element abc\ def represents a property
whose value is the symbol having the name "abc def".
- MTEXT
An element that matches the regular expression "([^"]|\\")*"
represents a property whose key is Mtext. The backslash
escape explained above also applies here. Moreover, each part in the
element matching the regular expression
\\[xX][0-9A-Fa-f][0-9A-Fa-f] is replaced with the byte whose value
the two hexadecimal digits denote.
After having resolved the backslash escapes, the byte sequence between
the double quotes is interpreted as a UTF-8 sequence and decoded into
an M-text. This M-text is the value of the property.
- PLIST
Zero or more elements surrounded by a pair of parentheses represent a
property whose key is Mplist. Whitespace before and after a
parenthesis can be omitted. The value of the property is a plist,
which is the result of recursive interpretation of the elements
between the parentheses.
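The four element types above can be told apart by their first
characters, and the integer and backslash rules are simple enough to
sketch in C. The code below is only an illustration under stated
assumptions: the names element_kind, classify_element, parse_integer,
and unescape are hypothetical and not part of the m17n API, and input
that matches none of the regular expressions (such as a lone '-') is
simply reported as ELEMENT_NONE.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical names; the library itself uses the symbols Minteger,
// Msymbol, Mtext, and Mplist as property keys.
enum element_kind
{ ELEMENT_NONE, ELEMENT_INTEGER, ELEMENT_SYMBOL, ELEMENT_MTEXT, ELEMENT_PLIST };

// Decide the type of the element that starts at P.
static enum element_kind
classify_element (const char *p)
{
  if (*p == '(')
    return ELEMENT_PLIST;
  if (*p == '"')
    return ELEMENT_MTEXT;
  if (isdigit ((unsigned char) *p))
    return ELEMENT_INTEGER;
  if (*p == '-')
    return isdigit ((unsigned char) p[1]) ? ELEMENT_INTEGER : ELEMENT_NONE;
  if (*p == '\0' || *p == ')')
    return ELEMENT_NONE;
  return ELEMENT_SYMBOL;
}

// Interpret an INTEGER element: a 0x or 0X prefix selects hexadecimal,
// anything else is read as decimal.
static long
parse_integer (const char *p)
{
  if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X'))
    return strtol (p + 2, NULL, 16);
  return strtol (p, NULL, 10);
}

// Resolve backslash escapes in the first LEN bytes of SRC and store the
// result, NUL-terminated, in DST (which needs room for LEN + 1 bytes).
// If MTEXT is nonzero, \xHH is also replaced by the byte 0xHH.
// Returns the number of bytes stored, not counting the NUL.
static size_t
unescape (const char *src, size_t len, int mtext, char *dst)
{
  size_t i = 0, n = 0;

  while (i < len)
    {
      char c = src[i++];

      if (c == '\\' && i < len)
        {
          c = src[i++];
          if (c == 't')
            c = '\t';
          else if (c == 'n')
            c = '\n';
          else if (c == 'r')
            c = '\r';
          else if (c == 'e')
            c = 27;
          else if (mtext && (c == 'x' || c == 'X') && i + 1 < len
                   && isxdigit ((unsigned char) src[i])
                   && isxdigit ((unsigned char) src[i + 1]))
            {
              char hex[3] = { src[i], src[i + 1], '\0' };

              c = (char) (unsigned char) strtol (hex, NULL, 16);
              i += 2;
            }
          // Any other escaped character is kept as is.
        }
      dst[n++] = c;
    }
  dst[n] = '\0';
  return n;
}
```

For an MTEXT element, the bytes produced by unescape would still have
to be decoded as UTF-8 to obtain the M-text; that step is left out of
this sketch.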
@section general-syntax SYNTAX NOTATION
In explanations of plist data formats, a BNF-like notation is
used. In the notation, non-terminals are represented by a string of
uppercase letters (including '-' in the middle), and terminals are
represented by a string surrounded by '"'. The special non-terminals
INTEGER, SYMBOL, MTEXT, and PLIST represent a property whose value is
an integer, a symbol, an M-text, or a plist, respectively.
@section general-example EXAMPLE
Here is an example of a syntax for database data that is read into a
plist of this simple format:
@verbatim
DATA-FORMAT ::=
[ INTEGER | SYMBOL | MTEXT | FUNC ] *
FUNC ::=
'(' FUNC-NAME FUNC-ARG * ')'
FUNC-NAME ::=
SYMBOL
FUNC-ARG ::=
INTEGER | SYMBOL | MTEXT | '(' FUNC-ARG ')'
@endverbatim
For instance, a data file that contains this text matches the above
syntax:
@verbatim
abc 123 (pqr 0xff) "m\"text" (_\\_ ("string" xyz) -456)
@endverbatim
and is read into this plist:
@verbatim
1st element: key: Msymbol, value: abc
2nd element: key: Minteger, value: 123
3rd element: key: Mplist, value: a plist of these elements:
    1st element: key: Msymbol, value: pqr
    2nd element: key: Minteger, value: 255
4th element: key: Mtext, value: m"text
5th element: key: Mplist, value: a plist of these elements:
    1st element: key: Msymbol, value: _\_
    2nd element: key: Mplist, value: a plist of these elements:
        1st element: key: Mtext, value: string
        2nd element: key: Msymbol, value: xyz
    3rd element: key: Minteger, value: -456
@endverbatim
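To make the nesting in this example concrete, here is a small
recursive walk in C that tallies the properties of each type at every
depth. It is a sketch, not the loader of the m17n library: the name
tally_elements and the N_* indices are hypothetical, a leading digit
or '-' is assumed to start an INTEGER, and an integer or symbol is
assumed to end at whitespace or a parenthesis.

```c
// Indices into the tally array; hypothetical names for this sketch.
enum { N_INTEGER, N_SYMBOL, N_MTEXT, N_PLIST, N_KINDS };

// Walk the elements starting at P and add one to TALLY for each
// property found at any nesting depth. Stops at an unmatched ')' or
// at the terminating NUL and returns a pointer to that character.
static const char *
tally_elements (const char *p, int tally[N_KINDS])
{
  for (;;)
    {
      // Skip whitespace and comments between elements.
      while (*p == ' ' || *p == '\t' || *p == '\n' || *p == ';')
        {
          if (*p++ == ';')
            {
              while (*p != '\0' && *p != '\n')
                p++;
            }
        }
      if (*p == '\0' || *p == ')')
        return p;
      if (*p == '(')
        {
          // PLIST: interpret the enclosed elements recursively.
          tally[N_PLIST]++;
          p = tally_elements (p + 1, tally);
          if (*p == ')')
            p++;
        }
      else if (*p == '"')
        {
          // MTEXT: scan to the closing quote, stepping over escapes.
          tally[N_MTEXT]++;
          for (p++; *p != '\0' && *p != '"'; p++)
            {
              if (*p == '\\' && p[1] != '\0')
                p++;
            }
          if (*p == '"')
            p++;
        }
      else
        {
          // INTEGER or SYMBOL, decided by the first character.
          int is_int = (*p >= '0' && *p <= '9') || *p == '-';

          tally[is_int ? N_INTEGER : N_SYMBOL]++;
          while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n'
                 && *p != '(' && *p != ')')
            {
              if (*p == '\\' && p[1] != '\0')
                p++;
              p++;
            }
        }
    }
}
```

Applied to the data file text above with a zeroed tally, the walk
finds three INTEGER, four SYMBOL, two MTEXT, and three PLIST
properties, which agrees with the listing.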
*/
/*
Copyright (C) 2003, 2004
National Institute of Advanced Industrial Science and Technology (AIST)
Registration Number H15PRO112
This file is part of the m17n database; a sub-part of the m17n
library.
The m17n library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1 of
the License, or (at your option) any later version.
The m17n library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the m17n library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
02111-1307, USA.
*/