1 /* Copyright (C) 2003, 2004, 2005
2 National Institute of Advanced Industrial Science and Technology (AIST)
3 Registration Number H15PRO112
4 See the end for copying conditions. */
8 @page mdbIM Input Method
10 @section im-description DESCRIPTION
12 The m17n library provides a driver for input methods that are
13 dynamically loadable from the m17n database (see @ref m17nInputMethod
14 @latexonly (P.\pageref{group__m17nInputMethod}) @endlatexonly).
This section describes the data format that defines those input methods.
19 @section im-format SYNTAX and SEMANTICS
The following data format defines an input method. The driver loads a
definition from a file, a stream, etc., and converts it into a plist.
27 IM-DECLARATION ? DESCRIPTION ? VARIABLE-LIST ? COMMAND-LIST ?
28 TITLE MAP-LIST MACRO-LIST ? MODULE-LIST ? STATE-LIST
30 IM-DECLARATION ::= '(' 'input-method' LANGUAGE NAME [ '(version' VERSION ')'] ')'
31 DESCRIPTION ::= '(' 'description' [ MTEXT-OR-GETTEXT | nil] ')'
32 VARIABLE-LIST ::= '(' 'variable' VARIABLE-DECLARATION * ')'
33 COMMAND-LIST ::= '(' 'command' COMMAND-DECLARATION * ')'
34 TITLE ::= '(' 'title' TITLE-TEXT ')'
36 VARIABLE-DECLARATION ::=
37 '(' VAR-NAME [ MTEXT-OR-GETTEXT | nil ] VALUE VALUE-CANDIDATE * ')'
39 COMMAND-DECLARATION ::=
40 '(' CMD-NAME [ MTEXT-OR-GETTEXT | nil ] KEYSEQ * ')'
43 [ MTEXT | '(' '_' MTEXT ')']
48 IM-DESCRIPTION ::= MTEXT
50 VAR-DESCRIPTION ::= MTEXT
51 VALUE ::= MTEXT | SYMBOL | INTEGER
52 VALUE-CANDIDATE ::= VALUE | '(' RANGE-FROM RANGE-TO ')'
53 RANGE-FROM ::= INTEGER
56 CMD-DESCRIPTION ::= MTEXT
@c IM-DECLARATION specifies the language and name of this input method.
63 @c VERSION specifies the required minimum version number of the m17n
64 library. The format is "XX.YY.ZZ" where XX is a major version
65 number, YY is a minor version number, and ZZ is a patch level.
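As an illustration of the "XX.YY.ZZ" format, the following Python
sketch splits a version string into its three numeric fields and
checks a library version against a required minimum. The function
names and the field-by-field comparison are assumptions made for this
example; they are not part of the m17n API.

```python
def parse_version(text):
    """Split "XX.YY.ZZ" into (major, minor, patch) integers."""
    major, minor, patch = (int(field) for field in text.split("."))
    return (major, minor, patch)


def satisfies(library_version, required_version):
    """Return True if library_version is at least required_version."""
    return parse_version(library_version) >= parse_version(required_version)
```

Comparing the fields as integers rather than as strings keeps a
version such as "1.10.0" newer than "1.9.9".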
67 @c DESCRIPTION specifies the description text of this input method by
MTEXT-OR-GETTEXT. If it takes the second form, the text is translated
69 according to the current locale by "gettext" (if the translation is
72 @c VARIABLE-DECLARATION declares a variable used in this input method.
73 If a variable must be initialized to the default value, or is to be
74 customized by a user, it must be declared here.
76 @c COMMAND-DECLARATION declares a command used in this input method.
77 If a command must be bound to the default key sequence, or is to be
78 customized by a user, it must be declared here.
80 @c TITLE-TEXT is a text displayed on the screen when this input method
84 MAP-LIST ::= '(' 'map' MAP * ')'
86 MAP ::= '(' MAP-NAME RULE * ')'
90 RULE ::= '(' KEYSEQ MAP-ACTION * ')'
92 KEYSEQ ::= MTEXT | '(' [ SYMBOL | INTEGER ] * ')'
95 @c SYMBOL in the definitions of @c MAP-NAME must not be @c t nor @c
98 @c MTEXT in the definition of @c KEYSEQ consists of characters that
99 can be generated by a keyboard. Therefore @c MTEXT usually contains
100 only ASCII characters. However, if the input method is intended to be
101 used, for instance, with a West European keyboard, @c MTEXT may
102 contain Latin-1 characters.
@c SYMBOL in the definition of @c KEYSEQ must be the return value of
the minput_event_to_key () function. Under the X Window System, you
can quickly check the value using the @c xev command. For example,
the return key, the backspace key, and the 0 key on the keypad are
represented as @c (Return) , @c (BackSpace) , and @c (KP_0)
respectively. If the shift, control, meta, alt, super, and hyper
modifiers are used, they are represented by the S- , C- , M- , A- ,
s- , and H- prefixes respectively, in this order. Thus, "return with
shift with meta with hyper" is @c (S-M-H-Return) . Note that "a with
shift" .. "z with shift" are represented simply as A .. Z . Thus "a
with shift with meta with hyper" is @c (M-H-A) .
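The naming rule for modified keys can be sketched in Python as
follows. The helper @c key_symbol and the modifier names it accepts
are hypothetical and exist only for this illustration; the real
symbol is whatever minput_event_to_key () returns.

```python
# Modifier prefixes in the order the text requires.
MODIFIER_PREFIXES = [
    ("shift", "S-"),
    ("control", "C-"),
    ("meta", "M-"),
    ("alt", "A-"),
    ("super", "s-"),
    ("hyper", "H-"),
]


def key_symbol(key, modifiers=()):
    """Build a KEYSEQ symbol name such as "S-M-H-Return" (hypothetical helper)."""
    active = set(modifiers)
    # "a with shift" .. "z with shift" are written as plain "A" .. "Z".
    if "shift" in active and len(key) == 1 and "a" <= key <= "z":
        key = key.upper()
        active.discard("shift")
    prefix = "".join(p for name, p in MODIFIER_PREFIXES if name in active)
    return prefix + key
```

The prefixes are emitted in table order, so the result does not
depend on the order in which the caller lists the modifiers.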
116 @c INTEGER in the definition of @c KEYSEQ must be a valid character
120 MAP-ACTION ::= ACTION
122 ACTION ::= INSERT | DELETE | SELECT | MOVE | MARK
123 | SHOW | HIDE | PUSHBACK | POP | UNDO | UNHANDLE | SHIFT | CALL
124 | SET | IF | COND | '(' MACRO-NAME ')'
126 PREDEFINED-SYMBOL ::=
127 '@0' | '@1' | '@2' | '@3' | '@4'
128 | '@5' | '@6' | '@7' | '@8' | '@9'
129 | '@<' | '@=' | '@>' | '@-' | '@+' | '@[' | '@]'
131 | '@-0' | '@-N' | '@+N'
134 MACRO-LIST ::= '(' 'macro' MACRO * ')'
136 MACRO ::= '(' MACRO-NAME MACRO-ACTION * ')'
138 MACRO-NAME ::= SYMBOL
140 MACRO-ACTION ::= ACTION
143 MODULE-LIST ::= '(' 'module' MODULE * ')'
145 MODULE ::= '(' MODULE-NAME FUNCTION * ')'
147 MODULE-NAME ::= SYMBOL
Each @c MODULE declares the name of an external module (i.e. a dynamic
library) and the names of the functions exported by the module. If a
@c FUNCTION is named "init", it is called with only the default
arguments (see the section about @c CALL) when an input context is
created for the input method. If a @c FUNCTION is named "fini", it is
called with only the default arguments when an input context is
destroyed.
160 STATE-LIST ::= '(' 'state' STATE * ')'
162 STATE ::= '(' STATE-NAME [ STATE-TITLE-TEXT ] BRANCH * ')'
164 STATE-NAME ::= SYMBOL
166 STATE-TITLE-TEXT ::= MTEXT
168 BRANCH ::= '(' MAP-NAME BRANCH-ACTION * ')'
169 | '(' nil BRANCH-ACTION * ')'
170 | '(' t BRANCH-ACTION * ')'
173 The optional @c STATE-TITLE-TEXT specifies a title text displayed on
174 the screen when the input method is in this state. If @c
175 STATE-TITLE-TEXT is omitted, @c TITLE-TEXT is used.
177 In the first form of @c BRANCH, @c MAP-NAME must be an item that
178 appears in @c MAP. In this case, if a key sequence matching one of @c
179 KEYSEQs of @c MAP-NAME is typed, @c BRANCH-ACTIONs are executed.
In the second form of @c BRANCH, @c BRANCH-ACTIONs are executed if a
key sequence that doesn't match any @c BRANCH of the current state is
typed.
185 In the third form of @c BRANCH, @c BRANCH-ACTIONs are executed when
186 shifted to the current state. If the current state is the initial
187 state, @c BRANCH-ACTIONs are executed also when an input context of
188 the input method is created.
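The choice among the three forms of @c BRANCH can be sketched in
Python as follows. This is a simplified model under stated
assumptions: a state is a list of (head, actions) pairs, where head is
a map name, @c None for @c nil, or @c True for @c t; a map is a set of
complete key sequences. The real driver matches keys incrementally,
which this sketch does not attempt.

```python
def actions_for_keyseq(branches, maps, keyseq):
    """Return the actions of the branch that handles keyseq, or None."""
    fallback = None
    for head, actions in branches:
        if head is None:            # second form: (nil BRANCH-ACTION *)
            fallback = actions
        elif head is True:          # third form: run on shift, not on keys
            continue
        elif keyseq in maps[head]:  # first form: (MAP-NAME BRANCH-ACTION *)
            return actions
    return fallback


def actions_on_shift(branches):
    """Return the actions of the (t ...) branch, run when the state is entered."""
    for head, actions in branches:
        if head is True:
            return actions
    return []
```

A state without a @c nil branch yields @c None for an unmatched key
sequence in this model; what the driver does in that case is outside
the scope of the sketch.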
191 BRANCH-ACTION ::= ACTION
194 An input method has the following two lists of symbols.
A marker is a symbol indicating a character position in the preedit
text. The @c MARK action assigns a position to a marker. The
position of a marker is referred to by the @c MOVE and the @c DELETE actions.
A variable is a symbol associated with an integer value. The value of
a variable is set by the @c SET action, and is referred to by the @c SET,
the @c INSERT, and the @c IF actions. All variables are implicitly
212 Each @c PREDEFINED-SYMBOL has a special meaning when used as a marker.
215 <li> @c @@0, @c @@1, @c @@2, @c @@3, @c @@4, @c @@5, @c @@6, @c @@7, @c @@8, @c @@9
217 The 0th, 1st, 2nd, ... 9th position respectively.
219 <li> @c @@<, @c @@=, @c @@>
221 The first, the current, and the last position.
225 The previous and the next position.
229 The previous and the next position where a candidate list changes.
Some of the @c PREDEFINED-SYMBOLs have a special meaning when used as a
candidate index in the @c SELECT action.
237 <li> @c @@<, @c @@=, @c @@>
239 The first, the current, and the last candidate of the current candidate group.
243 The previous candidate. If the current candidate is the first one in
244 the current candidate group, then it means the last candidate in the
245 previous candidate group.
249 The next candidate. If the current candidate is the last one in the
250 current candidate group, then it means the first candidate in the next
255 The candidate in the previous and the next candidate group having the same
256 candidate index as the current one.
The following symbol also has a special meaning.
264 Number of handled keys at that moment.
The following symbols support surrounding text handling.
273 -1 if surrounding text is supported, -2 if not.
277 Here, @c N is a positive integer. The value is the Nth previous
278 character in the preedit buffer. If there are only M (M<N) previous
279 characters in it, the value is the (N-M)th previous character from the
inputting spot. When this is used as the argument of the @c delete
action, it specifies the number of characters to be deleted.
285 Here, @c N is a positive integer. The value is the Nth following
286 character in the preedit buffer. If there are only M (M<N) following
287 characters in it, the value is the (N-M)th following character from
the inputting spot. When this is used as the argument of the @c delete
action, it specifies the number of characters to be deleted.
292 The arguments and the behavior of each action are listed below.
295 INSERT ::= '(' 'insert' MTEXT ')'
298 | '(' 'insert' SYMBOL ')'
299 | '(' 'insert' '(' CANDIDATES * ')' ')'
300 | '(' CANDIDATES * ')'
302 CANDIDATES ::= MTEXT | '(' MTEXT * ')'
305 The first and second forms insert @c MTEXT before the current position.
307 The third form inserts the character @c INTEGER before the current
310 The fourth form treats @c SYMBOL as a variable, and inserts its value
311 (if it is a valid character code) before the current position.
313 In the fifth and sixth forms, each @c CANDIDATES represents a
314 candidate group, and each element of @c CANDIDATES represents a
315 candidate, i.e. if @c CANDIDATES is an M-text, the candidates are the
316 characters in the M-text; if @c CANDIDATES is a list of M-texts, the
317 candidates are the M-texts in the list.
319 These forms insert the first candidate before the current position.
320 The inserted string is associated with the list of candidates and
321 the information indicating the currently selected candidate.
323 The marker positions affected by the insertion are automatically relocated.
326 DELETE ::= '(' 'delete' SYMBOL ')'
327 | '(' 'delete' INTEGER ')'
330 The first form treats @c SYMBOL as a marker, and deletes characters
331 between the current position and the marker position.
333 The second form treats @c INTEGER as a character position, and deletes
334 characters between the current position and the character position.
336 The marker positions affected by the deletion are automatically relocated.
339 SELECT ::= '(' 'select' PREDEFINED-SYMBOL ')'
340 | '(' 'select' INTEGER ')'
This action first checks if the character just before the current position
belongs to a string that is associated with a candidate list. If it does,
the action replaces that string with a candidate specified by the
348 The first form treats @c PREDEFINED-SYMBOL as a candidate index (as
349 described above) that specifies a new candidate in the candidate list.
351 The second form treats @c INTEGER as a candidate index that specifies a
352 new candidate in the candidate list.
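How a predefined symbol picks a candidate can be sketched in Python
as follows. The sketch assumes that @c @@- and @c @@+ name the
previous and the next candidate, models the candidate list as a list
of groups, and identifies a candidate by a (group, index) pair. The
behavior at the very first and very last candidate is not specified
above, so the sketch simply stays in place there; that choice is an
assumption.

```python
def select(groups, group, index, symbol):
    """Return the (group, index) chosen by a predefined candidate symbol.

    groups is a list of candidate groups, each a list of candidates.
    """
    last = len(groups[group]) - 1
    if symbol == "@<":
        return group, 0
    if symbol == "@=":
        return group, index
    if symbol == "@>":
        return group, last
    if symbol == "@-":
        if index > 0:
            return group, index - 1
        if group > 0:  # wrap to the last candidate of the previous group
            return group - 1, len(groups[group - 1]) - 1
        return group, index  # assumption: stay put at the very first candidate
    if symbol == "@+":
        if index < last:
            return group, index + 1
        if group < len(groups) - 1:  # wrap to the first candidate of the next group
            return group + 1, 0
        return group, index  # assumption: stay put at the very last candidate
    raise ValueError("unsupported symbol: " + symbol)
```

Integer indices and the group-jumping symbols are left out to keep the
example short.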
This action instructs the input method driver to display a candidate
359 list associated with the string before the current position.
365 This action instructs the input method driver to hide the currently
366 displayed candidate list.
369 MOVE ::= '(' 'move' SYMBOL ')'
370 | '(' 'move' INTEGER ')'
373 The first form treats @c SYMBOL as a marker, and makes the marker
374 position be the new current position.
376 The second form treats @c INTEGER as a character position, and makes
377 that position be the new current position.
380 MARK ::= '(' 'mark' SYMBOL ')'
383 This action treats @c SYMBOL as a marker, and sets its position to the
384 current position. @c SYMBOL must not be a @c PREDEFINED-SYMBOL.
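The interplay of the @c INSERT, @c DELETE, @c MOVE, and @c MARK
actions can be illustrated with a toy Python model of the preedit
text. The class and its method names are hypothetical. Only the
marker forms of @c MOVE and @c DELETE are modeled, and the rule that a
marker exactly at the insertion point stays where it is is an
assumption of this sketch, not a statement about the driver.

```python
class Preedit:
    """Toy model of a preedit text with a current position and markers."""

    def __init__(self):
        self.text = ""
        self.pos = 0        # current position
        self.markers = {}   # marker name -> character position

    def mark(self, name):
        self.markers[name] = self.pos

    def move(self, name):
        self.pos = self.markers[name]

    def insert(self, mtext):
        at = self.pos
        self.text = self.text[:at] + mtext + self.text[at:]
        self.pos += len(mtext)
        # Relocate markers that lie after the insertion point (assumption:
        # a marker exactly at the insertion point stays where it is).
        for name, p in self.markers.items():
            if p > at:
                self.markers[name] = p + len(mtext)

    def delete(self, name):
        start, end = sorted((self.pos, self.markers[name]))
        self.text = self.text[:start] + self.text[end:]
        self.pos = start
        # Relocate markers inside or after the deleted span.
        for key, p in self.markers.items():
            if p > end:
                self.markers[key] = p - (end - start)
            elif p > start:
                self.markers[key] = start
```

Marking a position, inserting more text, and then deleting back to the
marker removes exactly the text typed since the mark.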
PUSHBACK ::= '(' 'pushback' INTEGER ')'
388 | '(' 'pushback' KEYSEQ ')'
391 The first form pushes back the latest @c INTEGER number of key events
392 to the event queue if @c INTEGER is positive, and pushes back all key
393 events if @c INTEGER is zero.
395 The second form pushes back keys in @c KEYSEQ to the event queue.
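The first form of @c PUSHBACK can be sketched in Python as follows.
The two lists standing for the handled keys and the event queue, and
the clamping of a count larger than the number of handled keys, are
assumptions of this model.

```python
def pushback(handled, queue, count):
    """Model (pushback INTEGER): move handled keys back to the front of the queue.

    handled: keys already consumed, oldest first.
    queue:   keys still waiting to be handled, next key first.
    Returns the new (handled, queue) pair.
    """
    if count < 0:
        raise ValueError("this sketch only covers zero and positive counts")
    if count == 0 or count > len(handled):
        count = len(handled)   # zero means "all key events"
    split = len(handled) - count
    return handled[:split], handled[split:] + queue
```

The pushed-back keys keep their original order, so they are handled
again in the order in which they were typed.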
398 POP ::= '(' 'pop' ')'
401 This action pops the first key event that is not yet handled from the
UNDO ::= '(' 'undo' [ INTEGER | SYMBOL ] ')'
408 If there's no argument, this action cancels the last two key events
409 (i.e. the one that invoked this command, and the previous one).
411 If there's an integer argument NUM, it must be positive or negative
412 (not zero). If positive, from the NUMth to the last events are
canceled. If negative, the last (- NUM) events are canceled.
415 If there's a symbol argument, it must be resolved to an integer number
416 and the number is treated as the actual argument as above.
UNHANDLE ::= '(' 'unhandle' ')'
This action commits the current preedit and returns the last key as
SHIFT ::= '(' 'shift' STATE-NAME ')'
429 This action shifts the current state to @c STATE-NAME. @c
430 STATE-NAME must appear in @c STATE-LIST.
433 CALL ::= '(' 'call' MODULE-NAME FUNCTION ARG * ')'
435 ARG ::= INTEGER | SYMBOL | MTEXT | PLIST
438 This action calls the function @c FUNCTION of external module @c
439 MODULE-NAME. @c MODULE-NAME and @c FUNCTION must appear in @c
442 The function is called with an argument of the type (#MPlist *). The
443 key of the first element is #Mt and its value is a pointer to an
444 object of the type #MInputContext. The key of the second element is
445 #Msymbol and its value is the current state name. @c ARGs are used as
446 the value of the third and later elements. Their keys are determined
447 automatically; if an @c ARG is an integer, the corresponding key is
448 #Minteger; if an @c ARG is a symbol, the corresponding key is
451 The function must return NULL or a value of the type (#MPlist *) that
452 represents a list of actions to take.
455 SET ::= '(' CMD SYMBOL1 EXPRESSION ')'
457 CMD ::= 'set' | 'add' | 'sub' | 'mul' | 'div'
459 EXPRESSION ::= INTEGER | SYMBOL2 | '(' OPERAND EXPRESSION * ')'
461 OPERAND ::= '+' | '-' | '*' | '/' | '|' | '&' | '!'
462 | '=' | '<' | '>' | '<=' | '>='
466 This action treats @c SYMBOL1 and @c SYMBOL2 as variables and sets the
467 value of @c SYMBOL1 as below.
469 If @c CMD is 'set', it sets the value of @c SYMBOL1 to the value of @c
472 If @c CMD is 'add', it increments the value of @c SYMBOL1 by the value
475 If @c CMD is 'sub', it decrements the value of @c SYMBOL1 by the value
478 If @c CMD is 'mul', it multiplies the value of @c SYMBOL1 by the value
481 If @c CMD is 'div', it divides the value of @c SYMBOL1 by the value of
485 IF ::= '(' CONDITION ACTION-LIST1 ACTION-LIST2 ')'
487 CONDITION ::= [ '=' | '<' | '>' | '<=' | '>=' ] EXPRESSION1 EXPRESSION2
489 ACTION-LIST1 ::= '(' ACTION * ')'
491 ACTION-LIST2 ::= '(' ACTION * ')'
494 This action performs actions in @c ACTION-LIST1 if @c CONDITION is
495 true, and performs @c ACTION-LIST2 (if any) otherwise.
497 @c SYMBOL1 and @c SYMBOL2 are treated as variables.
COND ::= '(' 'cond' [ '(' EXPRESSION ACTION * ')' ] * ')'
This action performs the @c ACTIONs of the first clause whose
@c EXPRESSION has a nonzero value.
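The arithmetic of the @c SET family and the clause selection of
@c COND can be sketched in Python as follows. This is a partial model
under stated assumptions: expressions are integers, variable names, or
three-element lists with exactly two operands; the operators '|', '&',
and '!' are omitted; comparisons yield 1 or 0; unset variables read as
0; and division uses Python floor division, which may differ from the
driver for negative values.

```python
import operator

BINARY = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.floordiv,
    "=": operator.eq, "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}
COMMANDS = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def evaluate(expr, variables):
    """Evaluate an EXPRESSION: an integer, a variable name, or a list."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return variables.get(expr, 0)   # assumption: unset variables read as 0
    op, left, right = expr              # this sketch handles two operands only
    return int(BINARY[op](evaluate(left, variables), evaluate(right, variables)))


def run_set(cmd, symbol, expr, variables):
    """Apply (CMD SYMBOL1 EXPRESSION) for set, add, sub, mul, and div."""
    value = evaluate(expr, variables)
    if cmd != "set":
        value = BINARY[COMMANDS[cmd]](variables.get(symbol, 0), value)
    variables[symbol] = value


def run_cond(clauses, variables):
    """Return the actions of the first clause whose EXPRESSION is nonzero."""
    for expr, actions in clauses:
        if evaluate(expr, variables) != 0:
            return actions
    return []
```

Applying @c mul by 16 and then @c add for each of the hexadecimal
digits 2, 1, 9, 0 leaves the variable holding the character code
U+2190, which is how a character code can be accumulated one digit at
a time.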
508 @section im-example1 EXAMPLE 1
510 This is a very simple example for inputting Latin characters with
511 diacritical marks (acute and cedilla). For instance, when you type:
513 Comme'die-Franc,aise, chic,,
518 Commédie-Française, chic,
523 \hskip5mm\texttt{\footnotesize Comm\'{e}die-Fran\c{c}aise, chic,}
The definition of the input method is very simple, as shown below, and
it is straightforward to extend it to cover all Latin characters.
532 (title "latin-postfix")
535 ("a'" ?á) ("e'" ?é) ("i'" ?í) ("o'" ?ó) ("u'" ?ú) ("c," ?ç)
536 ("A'" ?Á) ("E'" ?É) ("I'" ?Í) ("O'" ?Ó) ("U'" ?Ú) ("C," ?Ç)
537 ("a''" "a'") ("e''" "e'") ("i''" "i'") ("o''" "o'") ("u''" "u'")
539 ("A''" "A'") ("E''" "E'") ("I''" "I'") ("O''" "O'") ("U''" "U'")
548 \texttt{\footnotesize
549 \hskip2mm(title "latin-postfix")\\
552 \hskip6mm ("a'" ?\'{a}) ("e'" ?\'{e}) ("i'" ?\'{i}) ("o'" ?\'{o})
553 ("u'" ?\'{u}) ("c," ?\c{c})\\
554 \hskip6mm ("A'" ?\'{A}) ("E'" ?\'{E}) ("I'" ?\'{I}) ("O'" ?\'{O})
555 ("U'" ?\'{U}) ("C," ?\c{C})\\
556 \hskip6mm ("a''" "a'") ("e''" "e'") ("i''" "i'") ("o''" "o'") ("u''" "u'")\\
557 \hskip6mm ("c,," "c,")\\
558 \hskip6mm ("A''" "A'") ("E''" "E'") ("I''" "I'") ("O''" "O'") ("U''" "U'")\\
559 \hskip6mm ("C,," "C,")))\\
566 @section im-example2 EXAMPLE 2
568 This example is for inputting Unicode characters by typing C-u
569 (Control-u) followed by four hexadecimal digits. For instance, when
570 you type ("^u" means Control-u):
572 ^u2190^u2191^u2192^u2193
574 you will get this (Unicode arrow symbols):
577 $\leftarrow \uparrow \rightarrow \downarrow
The definition uses the @c SET and @c IF actions as below:
593 ("0" ?0) ("1" ?1) ... ("9" ?9) ("a" ?A) ("b" ?B) ... ("f" ?F)))
596 (starter (set code 0) (set count 0) (shift unicode)))
602 (mul code 16) (add code this)
605 ((delete @<) (insert code) (shift init))))))
608 @section im-example3 EXAMPLE 3
610 This example is for inputting Chinese characters by typing PinYin key
613 For instance, when you type:
622 The definition utilizes @c CANDIDATE and @c SELECT commands as below.
623 Note that this is just an example, and it ignores such important key
630 ;; The initial character of Pinyin.
632 ("a") ("b") ... ("h") ("j") ... ("t") ("w") ("x") ("y") ("z"))
634 ;; Big table of Pinyin vs the corresponding Chinese characters.
637 ("bei" ("被北备背悲辈杯倍贝碑" ...))
638 ("hao" ("好号毫豪浩耗皓嚎昊郝" ...))
639 ("jing" ("经京精境警竟静惊景敬" ...))
640 ("ni" ("你呢尼泥逆倪匿拟腻妮" ...))
642 ;; Typing 1, 2, ..., 0 selects the 0th, 1st, ..., 9th candidate.
644 ("1" (select 0)) ("2" (select 1)) ... ("9" (select 8)) ("0" (select 9))))
648 ;; When an initial character of Pinyin is typed, re-handle it in
649 ;; "main" state. Anything else is just produced as is.
650 (starter (show) (pushback 1) (shift main)))
653 ;; When a complete Pinyin sequence is typed, shift to "select" state
654 ;; to allow users to select one from the candidates.
655 (pinyin (shift select))
657 ;; When anything else is typed, produce the current candidate (if
658 ;; any), and re-handle the last input in "init" state.
659 (nil (hide) (shift init)))
;; When a number is typed, select the corresponding candidate,
663 ;; produce it, and shift to "init" state.
664 (choose (hide) (shift init))
666 ;; When anything else is typed, produce the current candidate,
667 ;; and re-handle the last input in "init" state.
668 (nil (hide) (shift init))))
674 \fbox{This example is readable only in the documentation of HTML version.}
681 @section im-seealso SEE ALSO
683 @ref mim-list "Input Methods provided by the m17n database",
684 @ref mdbGeneral "mdbGeneral(5)"
688 Copyright (C) 2003, 2004, 2005
689 National Institute of Advanced Industrial Science and Technology (AIST)
690 Registration Number H15PRO112
692 This file is part of the m17n database; a sub-part of the m17n
695 The m17n library is free software; you can redistribute it and/or
696 modify it under the terms of the GNU Lesser General Public License
697 as published by the Free Software Foundation; either version 2.1 of
698 the License, or (at your option) any later version.
700 The m17n library is distributed in the hope that it will be useful,
701 but WITHOUT ANY WARRANTY; without even the implied warranty of
702 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
703 Lesser General Public License for more details.
705 You should have received a copy of the GNU Lesser General Public
706 License along with the m17n library; if not, write to the Free
707 Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
708 Boston, MA 02110-1301, USA.
711 /* Local Variables: */