/* Copyright (C) 2003, 2004
National Institute of Advanced Industrial Science and Technology (AIST)
Registration Number H15PRO112
See the end for copying conditions. */
/***en
@page mdbIM Input Method
@section im-description DESCRIPTION
The m17n library provides a driver for input methods that are
dynamically loadable from the m17n database (see @ref m17nInputMethod
@latexonly (P.\pageref{group__m17nInputMethod}) @endlatexonly).
This section describes the data format that defines those input
methods.
@section im-format SYNTAX and SEMANTICS
The following data format defines an input method. The driver loads
a definition from a file, a stream, etc. A definition is converted
into a plist in the driver.
@verbatim
INPUT-METHOD ::= TITLE MAP-LIST MACRO-LIST ? MODULE-LIST ? STATE-LIST
TITLE ::= '(' 'title' MTEXT ')'
@endverbatim
@c MTEXT is the text displayed on the screen while this input method is
active.
@verbatim
MAP-LIST ::= '(' 'map' MAP * ')'
MAP ::= '(' MAP-NAME RULE * ')'
MAP-NAME ::= SYMBOL
RULE ::= '(' KEYSEQ MAP-ACTION * ')'
KEYSEQ ::= MTEXT | '(' [ SYMBOL | INTEGER ] * ')'
@endverbatim
@c SYMBOL in the definition of @c MAP-NAME must be neither @c t nor
@c nil.
@c MTEXT in the definition of @c KEYSEQ consists of characters that
can be generated by a keyboard. Therefore @c MTEXT usually contains
only ASCII characters. However, if the input method is intended to be
used, for instance, with a West European keyboard, @c MTEXT may
contain Latin-1 characters.
@c SYMBOL in the definition of @c KEYSEQ must be the return value of
the minput_event_to_key () function.
@c INTEGER in the definition of @c KEYSEQ must be a valid character
code.
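As an illustrative sketch of the three @c KEYSEQ forms, the following
map contains one rule of each kind. The map name @c sample is
hypothetical, and @c C-U is assumed to be a symbol that the
minput_event_to_key () function returns for Control-u.
@verbatim
(map
  (sample
    ;; KEYSEQ as an M-text: the keys "a" and "'" typed in this order.
    ("a'" ?á)
    ;; KEYSEQ as a list of symbols: a single Control-u key (assumed name).
    ((C-U) "U+")
    ;; KEYSEQ as a list of integers: character codes 101 (e) and 39 (').
    ((101 39) ?é)))
@endverbatim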
@verbatim
MAP-ACTION ::= ACTION
ACTION ::= INSERT | DELETE | SELECT | MOVE | MARK
| SHOW | HIDE | PUSHBACK | UNDO | SHIFT | CALL
| SET | IF | '(' MACRO-NAME ')'
PREDEFINED-SYMBOL ::=
'@<' | '@=' | '@>' | '@-' | '@+' | '@[' | '@]'
@endverbatim
@verbatim
MACRO-LIST ::= '(' 'macro' MACRO * ')'
MACRO ::= '(' MACRO-NAME MACRO-ACTION * ')'
MACRO-NAME ::= SYMBOL
MACRO-ACTION ::= ACTION
@endverbatim
@verbatim
MODULE-LIST ::= '(' 'module' MODULE * ')'
MODULE ::= '(' MODULE-NAME FUNCTION * ')'
MODULE-NAME ::= SYMBOL
FUNCTION ::= SYMBOL
@endverbatim
Each @c MODULE declares the name of an external module (i.e. a dynamic
library) and the names of the functions exported by the module. If a
@c FUNCTION is named "init", it is called with only the default
arguments (see the section about @c CALL) when an input context is
created for the input method. If a @c FUNCTION is named "fini", it is
called with only the default arguments when an input context is destroyed.
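For instance, a module list might look like the sketch below. The
library name @c libexample and the function name @c lookup are
hypothetical; only the names "init" and "fini" have the special
meaning described above.
@verbatim
(module
  ;; "libexample" and "lookup" are made-up names for illustration.
  (libexample init fini lookup))
@endverbatim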
@verbatim
STATE-LIST ::= '(' 'state' STATE * ')'
STATE ::= '(' STATE-NAME BRANCH * ')'
STATE-NAME ::= SYMBOL
BRANCH ::= '(' MAP-NAME BRANCH-ACTION * ')'
| '(' nil BRANCH-ACTION * ')'
| '(' t BRANCH-ACTION * ')'
@endverbatim
In the first form of @c BRANCH, @c MAP-NAME must be the name of a
@c MAP that appears in @c MAP-LIST. In this case, if a key sequence
matching one of the @c KEYSEQs of @c MAP-NAME is typed, the
@c BRANCH-ACTIONs are executed.
In the second form of @c BRANCH, the @c BRANCH-ACTIONs are executed if
a key sequence that doesn't match any other @c BRANCH of the current
state is typed.
In the third form of @c BRANCH, @c BRANCH-ACTIONs are executed if we
shift to the current state after handling all typed keys. If the
current state is the initial state, @c BRANCH-ACTIONs are executed
just after an input context of the input method is created.
@verbatim
BRANCH-ACTION ::= ACTION
@endverbatim
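The three forms of @c BRANCH can be combined as in the following
sketch. The state name @c composing, the variable @c count, and the
map @c trans (assumed to be defined in @c MAP-LIST) are illustrative
only.
@verbatim
(state
  (init
    ;; Third form: executed when the input method shifts to this state.
    (t (set count 0))
    ;; First form: executed when a KEYSEQ of the map "trans" is typed.
    (trans (shift composing)))
  (composing
    (trans (add count 1))
    ;; Second form: executed when the typed keys match no other branch.
    (nil (shift init))))
@endverbatim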
An input method has the following two lists of symbols.
- marker list
A marker is a symbol indicating a character position in the preediting
text. The @c MARK action assigns a position to a marker. The
position of a marker is referred to by the @c MOVE and the @c DELETE actions.
- variable list
A variable is a symbol associated with an integer value. The value of
a variable is set by the @c SET action, and is referred to by the @c SET,
the @c INSERT, and the @c IF actions. All variables are implicitly
initialized to zero.
Each @c PREDEFINED-SYMBOL has a special meaning when used as a marker.
- @c @@<, @c @@=, @c @@>
The first, the current, and the last position.
- @c @@-, @c @@+
The previous and the next position.
- @c @@[, @c @@]
The previous and the next position where a candidate list changes.
Each @c PREDEFINED-SYMBOL has a special meaning when used as a candidate
index in the @c SELECT action.
- @c @@<, @c @@=, @c @@>
The first, the current, and the last candidate of the current candidate group.
- @c @@-
The previous candidate. If the current candidate is the first one in
the current candidate group, then it means the last candidate in the
previous candidate group.
- @c @@+
The next candidate. If the current candidate is the last one in the
current candidate group, then it means the first candidate in the next
candidate group.
- @c @@[, @c @@]
The candidate in the previous and the next candidate group having the same
candidate index as the current one.
The arguments and the behavior of each action are listed below.
@verbatim
INSERT ::= '(' 'insert' MTEXT ')'
| MTEXT
| INTEGER
| '(' 'insert' SYMBOL ')'
| '(' 'insert' '(' CANDIDATES * ')' ')'
| '(' CANDIDATES * ')'
CANDIDATES ::= MTEXT | '(' MTEXT * ')'
@endverbatim
The first and second forms insert @c MTEXT before the current position.
The third form inserts the character @c INTEGER before the current
position.
The fourth form treats @c SYMBOL as a variable, and inserts its value
(if it is a valid character code) before the current position.
In the fifth and sixth forms, each @c CANDIDATES represents a
candidate group, and each element of @c CANDIDATES represents a
candidate, i.e. if @c CANDIDATES is an M-text, the candidates are the
characters in the M-text; if @c CANDIDATES is a list of M-texts, the
candidates are the M-texts in the list.
These forms insert the first candidate before the current position.
The inserted string is associated with the list of candidates and
the information indicating the currently selected candidate.
The marker positions affected by the insertion are automatically relocated.
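The forms of @c INSERT can be illustrated as follows. This is a sketch
only; the variable name @c code is hypothetical and is assumed to hold
a valid character code.
@verbatim
;; First form: insert the M-text "abc".
(insert "abc")
;; Second form: same effect as the first form.
"abc"
;; Third form: insert the character "a".
?a
;; Fourth form: insert the character whose code is the value of "code".
(insert code)
;; Fifth form: two candidate groups.
(insert ("abc" ("de" "fg")))
;; Sixth form: same effect as the fifth form.
("abc" ("de" "fg"))
@endverbatim
In the last two forms, the first candidate group consists of the
candidates "a", "b", and "c", the second group consists of "de" and
"fg", and the candidate "a" is the one inserted.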
@verbatim
DELETE ::= '(' 'delete' SYMBOL ')'
| '(' 'delete' INTEGER ')'
@endverbatim
The first form treats @c SYMBOL as a marker, and deletes characters
between the current position and the marker position.
The second form treats @c INTEGER as a character position, and deletes
characters between the current position and the character position.
The marker positions affected by the deletion are automatically relocated.
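A minimal sketch of both forms follows. It assumes a hypothetical
marker named @c start that was set earlier by a @c MARK action.
@verbatim
;; Delete the characters between the current position and the marker "start".
(delete start)
;; Delete the characters between the current position and the first position.
(delete @<)
;; Delete the characters between the current position and character position 0.
(delete 0)
@endverbatim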
@verbatim
SELECT ::= '(' 'select' PREDEFINED-SYMBOL ')'
| '(' 'select' INTEGER ')'
@endverbatim
This action first checks whether the character just before the current
position belongs to a string that is associated with a candidate list.
If it does, the action replaces that string with the candidate
specified by the argument.
The first form treats @c PREDEFINED-SYMBOL as a candidate index (as
described above) that specifies a new candidate in the candidate list.
The second form treats @c INTEGER as a candidate index that specifies a
new candidate in the candidate list.
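For illustration, assume that the string before the current position
was inserted by the action ("abc" ("de" "fg")) and that the candidate
"a" is currently selected. Under that assumption, each line of the
following sketch, considered independently, behaves as described in
its comment.
@verbatim
;; Replace "a" with the next candidate, "b".
(select @+)
;; Replace "a" with the last candidate of the current group, "c".
(select @>)
;; Replace the current candidate with the one whose index is 0.
(select 0)
@endverbatim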
@verbatim
SHOW ::= '(show)'
@endverbatim
This action instructs the input method driver to display a candidate
list associated with the string before the current position.
@verbatim
HIDE ::= '(hide)'
@endverbatim
This action instructs the input method driver to hide the currently
displayed candidate list.
@verbatim
MOVE ::= '(' 'move' SYMBOL ')'
| '(' 'move' INTEGER ')'
@endverbatim
The first form treats @c SYMBOL as a marker, and makes the marker
position the new current position.
The second form treats @c INTEGER as a character position, and makes
that position the new current position.
@verbatim
MARK ::= '(' 'mark' SYMBOL ')'
@endverbatim
This action treats @c SYMBOL as a marker, and sets its position to the
current position. @c SYMBOL must not be a @c PREDEFINED-SYMBOL.
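The @c MARK, @c MOVE, and @c DELETE actions can be combined. The
following sketch remembers a position, inserts text, and then removes
that text again. The marker name @c start is hypothetical, and the
sketch assumes that a marker located exactly at the insertion point
stays in front of the inserted text.
@verbatim
;; Remember the current position as "start".
(mark start)
;; Insert "abc"; the current position is now after "c".
"abc"
;; Delete "abc" (assuming "start" still points in front of it).
(delete start)
;; Move the current position to the last position.
(move @>)
@endverbatim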
@verbatim
PUSHBACK ::= '(' 'pushback' INTEGER ')'
@endverbatim
This action pushes back the latest @c INTEGER key events to the event queue.
@verbatim
UNDO ::= '(undo)'
@endverbatim
This action cancels the last key event.
@verbatim
SHIFT ::= '(' 'shift' STATE-NAME ')'
@endverbatim
This action shifts the current state to @c STATE-NAME. @c
STATE-NAME must appear in @c STATE-LIST.
@verbatim
CALL ::= '(' 'call' MODULE-NAME FUNCTION ARG * ')'
ARG ::= INTEGER | SYMBOL | MTEXT | PLIST
@endverbatim
This action calls the function @c FUNCTION of the external module @c
MODULE-NAME. Both @c MODULE-NAME and @c FUNCTION must appear in @c
MODULE-LIST.
The function is called with an argument of the type (#MPlist *). The
key of the first element is #Mt and its value is a pointer to an
object of the type #MInputContext. The key of the second element is
#Msymbol and its value is the current state name. @c ARGs are used as
the value of the third and later elements. Their keys are determined
automatically; if an @c ARG is an integer, the corresponding key is
#Minteger; if an @c ARG is a symbol, the corresponding key is
#Msymbol, etc.
The function must return NULL or a value of the type (#MPlist *) that
represents a list of actions to take.
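As a sketch, suppose that @c MODULE-LIST declares a hypothetical module
@c libexample that exports a hypothetical function @c lookup. A call
with three arguments could then be written as:
@verbatim
;; "libexample", "lookup", and "mode" are made-up names for illustration.
(call libexample lookup 10 mode "abc")
@endverbatim
According to the description above, the function would receive a plist
of five elements: the input context (key #Mt), the current state name
(key #Msymbol), the integer 10 (key #Minteger), the symbol @c mode (key
#Msymbol), and the M-text "abc". The key of the last element is
presumably #Mtext; the description above covers that case only by "etc.".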
@verbatim
SET ::= '(' OPERAND SYMBOL1 [ INTEGER | SYMBOL2 ] ')'
OPERAND ::= 'set' | 'add' | 'sub' | 'mul' | 'div'
@endverbatim
This action treats @c SYMBOL1 and @c SYMBOL2 as variables and sets the
value of @c SYMBOL1 as below.
If @c OPERAND is 'set', it sets the value of @c SYMBOL1 to @c INTEGER or the
value of @c SYMBOL2.
If @c OPERAND is 'add', it increments the value of @c SYMBOL1 by @c INTEGER
or the value of @c SYMBOL2.
If @c OPERAND is 'sub', it decrements the value of @c SYMBOL1 by @c INTEGER
or the value of @c SYMBOL2.
If @c OPERAND is 'mul', it multiplies the value of @c SYMBOL1 by @c INTEGER
or the value of @c SYMBOL2.
If @c OPERAND is 'div', it divides the value of @c SYMBOL1 by @c INTEGER or
the value of @c SYMBOL2.
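A worked sketch of the arithmetic follows. It uses the hypothetical
variables @c x and @c y (both initially zero, like every variable) and
assumes that 'div' performs integer division.
@verbatim
(set x 10)   ;; x = 10
(add x 5)    ;; x = 15
(mul x 2)    ;; x = 30
(div x 3)    ;; x = 10
(sub x 1)    ;; x = 9
(set y x)    ;; y = 9 (the second operand is the variable x)
@endverbatim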
@verbatim
IF ::= '(' CONDITION ACTION-LIST1 ACTION-LIST2 ? ')'
CONDITION ::= OPERATOR VAL1 VAL2
ACTION-LIST1 ::= '(' ACTION * ')'
ACTION-LIST2 ::= '(' ACTION * ')'
OPERATOR ::= '=' | '<' | '>'
VAL1 ::= [ INTEGER1 | SYMBOL1 ]
VAL2 ::= [ INTEGER2 | SYMBOL2 ]
@endverbatim
This action performs actions in @c ACTION-LIST1 if @c CONDITION is
true, and performs @c ACTION-LIST2 (if any) otherwise.
@c SYMBOL1 and @c SYMBOL2 are treated as variables.
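The sketch below follows the notation used by the UNICODE example of
this document, in which the comparison operator comes first and is
followed by the two values and the action lists. The variable name
@c count and the state name @c init are hypothetical.
@verbatim
;; If count is less than 4, increment it; otherwise shift to "init".
(< count 4
   ((add count 1))
   ((shift init)))
;; Without ACTION-LIST2: nothing is done unless count equals 4.
(= count 4
   ((shift init)))
@endverbatim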
@ifnot FOR-MAN
@section im-example1 EXAMPLE 1
This is a very simple example for inputting Latin characters with
diacritical marks (acute and cedilla). For instance, when you type:
@verbatim
Comme'die-Franc,ais, chic,,
@endverbatim
you will get this:
@if FOR-HTML
@verbatim
Commédie-Français, chic,
@endverbatim
@endif
@if FOR-LATEX
@latexonly
\hskip5mm\texttt{\footnotesize Comm\'{e}die-Fran\c{c}ais, chic,}
@endlatexonly
@endif
The definition of the input method is as simple as shown below, and it
is quite straightforward to extend it to cover all Latin characters.
@if FOR-HTML
@verbatim
(title "latin-postfix")
(map
(trans
("a'" ?á) ("e'" ?é) ("i'" ?í) ("o'" ?ó) ("u'" ?ú) ("c," ?ç)
("A'" ?Á) ("E'" ?É) ("I'" ?Í) ("O'" ?Ó) ("U'" ?Ú) ("C," ?Ç)
("a''" "a'") ("e''" "e'") ("i''" "i'") ("o''" "o'") ("u''" "u'")
("c,," "c,")
("A''" "A'") ("E''" "E'") ("I''" "I'") ("O''" "O'") ("U''" "U'")
("C,," "C,")))
(state
(init
(trans)))
@endverbatim
@endif
@if FOR-LATEX
@latexonly
\texttt{\footnotesize
\hskip2mm(title "latin-postfix")\\
\hskip2mm(map\\
\hskip4mm (trans\\
\hskip6mm ("a'" ?\'{a}) ("e'" ?\'{e}) ("i'" ?\'{i}) ("o'" ?\'{o})
("u'" ?\'{u}) ("c," ?\c{c})\\
\hskip6mm ("A'" ?\'{A}) ("E'" ?\'{E}) ("I'" ?\'{I}) ("O'" ?\'{O})
("U'" ?\'{U}) ("C," ?\c{C})\\
\hskip6mm ("a''" "a'") ("e''" "e'") ("i''" "i'") ("o''" "o'") ("u''" "u'")\\
\hskip6mm ("c,," "c,")\\
\hskip6mm ("A''" "A'") ("E''" "E'") ("I''" "I'") ("O''" "O'") ("U''" "U'")\\
\hskip6mm ("C,," "C,")))\\
\hskip2mm(state\\
\hskip4mm (init\\
\hskip6mm (trans)))}
@endlatexonly
@endif
@section im-example2 EXAMPLE 2
This example is for inputting Unicode characters by typing C-u
(Control-u) followed by four hexadecimal digits. For instance, when
you type ("^u" means Control-u):
@verbatim
^u2190^u2191^u2192^u2193
@endverbatim
you will get this (Unicode arrow symbols):
@verbatim
←↑→↓
@endverbatim
The definition uses the @c SET and @c IF actions as shown below:
@verbatim
(title "UNICODE")
(map
(starter
((C-U) "U+"))
(hex
("0" ?0) ("1" ?1) ... ("9" ?9) ("a" ?A) ("b" ?B) ... ("f" ?F)))
(state
(init
(starter (set code 0) (set count 0) (shift unicode)))
(unicode
(hex (set this @-)
(< this ?A
((sub this 48))
((sub this 55)))
(mul code 16) (add code this)
(add count 1)
(= count 4
((delete @<) (insert code) (shift init))))))
@endverbatim
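To see how the actions above accumulate the character code, consider
typing "2190" after Control-u. The trace below is a hand-worked sketch;
it assumes that (set this @@-) stores the code of the character just
inserted by the @c hex map.
@verbatim
key   this (after sub)   code                   count
"2"   50 - 48 = 2        0 * 16 + 2   = 2       1
"1"   49 - 48 = 1        2 * 16 + 1   = 33      2
"9"   57 - 48 = 9        33 * 16 + 9  = 537     3
"0"   48 - 48 = 0        537 * 16 + 0 = 8592    4
@endverbatim
Since 8592 is 0x2190, the final (insert code) produces U+2190, the
leftwards arrow. For a key such as "a", the map inserts "A" (code 65),
and 65 - 55 = 10 gives the value of that hexadecimal digit.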
@section im-example3 EXAMPLE 3
This example is for inputting Chinese characters by typing a Pinyin key
sequence.
@if FOR-HTML
For instance, when you type:
@verbatim
nihaobei2jing2
@endverbatim
you will get:
@verbatim
你好北京
@endverbatim
The definition uses @c CANDIDATES and the @c SELECT action as shown
below. Note that this is just an example, and it ignores such important
keys as Backspace.
@verbatim
(title "拼")
(map
;; The initial character of Pinyin.
(starter
("a") ("b") ... ("h") ("j") ... ("t") ("w") ("x") ("y") ("z"))
;; Big table of Pinyin vs the corresponding Chinese characters.
(pinyin
...
("bei" ("被北备背悲辈杯倍贝碑" ...))
("hao" ("好号毫豪浩耗皓嚎昊郝" ...))
("jing" ("经京精境警竟静惊景敬" ...))
("ni" ("你呢尼泥逆倪匿拟腻妮" ...))
...)
;; Typing 1, 2, ..., 0 selects the 0th, 1st, ..., 9th candidate.
(choose
("1" (select 0)) ("2" (select 1)) ... ("9" (select 8)) ("0" (select 9))))
(state
(init
;; When an initial character of Pinyin is typed, re-handle it in
;; "main" state. Anything else is just produced as is.
(starter (show) (pushback 1) (shift main)))
(main
;; When a complete Pinyin sequence is typed, shift to "select" state
;; to allow users to select one from the candidates.
(pinyin (shift select))
;; When anything else is typed, produce the current candidate (if
;; any), and re-handle the last input in "init" state.
(nil (hide) (shift init)))
(select
;; When a number is typed, select the corresponding candidate,
;; produce it, and shift to "init" state.
(choose (hide) (shift init))
;; When anything else is typed, produce the current candidate,
;; and re-handle the last input in "init" state.
(nil (hide) (shift init))))
@endverbatim
@elseif FOR-LATEX
@latexonly
\begin{center}
\fbox{This example is readable only in the HTML version of the documentation.}
\end{center}
@endlatexonly
@endif
@endif
@section im-seealso SEE ALSO
@ref mim-list "Input Methods provided by the m17n database",
@ref mdbGeneral "mdbGeneral(5)"
*/
/*
Copyright (C) 2003, 2004
National Institute of Advanced Industrial Science and Technology (AIST)
Registration Number H15PRO112
This file is part of the m17n database; a sub-part of the m17n
library.
The m17n library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1 of
the License, or (at your option) any later version.
The m17n library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the m17n library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
02111-1307, USA.
*/
/* Local Variables: */
/* coding: utf-8 */
/* End: */