1 /* Copyright (C) 2003, 2004, 2005
2 National Institute of Advanced Industrial Science and Technology (AIST)
3 Registration Number H15PRO112
4 See the end for copying conditions. */
8 @page mdbIM Input Method
10 @section im-description DESCRIPTION
12 The m17n library provides a driver for input methods that are
13 dynamically loadable from the m17n database (see @ref m17nInputMethod
14 @latexonly (P.\pageref{group__m17nInputMethod}) @endlatexonly).
This section describes the data format that defines those input methods.
19 @section im-format SYNTAX and SEMANTICS
The following data format defines an input method. The driver loads a
definition from a file, a stream, or a similar source, and converts it
into a plist internally.
27 IM-DECLARATION ? DESCRIPTION ? VARIABLE-LIST ? COMMAND-LIST ?
28 TITLE MAP-LIST MACRO-LIST ? MODULE-LIST ? STATE-LIST
30 IM-DECLARATION ::= '(' 'input-method' LANGUAGE NAME ')'
31 DESCRIPTION ::= '(' 'description' MTEXT ')'
32 VARIABLE-LIST ::= '(' 'variable' VARIABLE-DECLARATION * ')'
33 COMMAND-LIST ::= '(' 'command' COMMAND-DECLARATION * ')'
34 TITLE ::= '(' 'title' TITLE-TEXT ')'
36 VARIABLE-DECLARATION ::=
37 '(' VAR-NAME [ VAR-DESCRIPTION | nil ] VALUE VALUE-CANDIDATE * ')'
39 COMMAND-DECLARATION ::=
40 '(' CMD-NAME [ CMD-DESCRIPTION | nil ] KEYSEQ * ')'
44 IM-DESCRIPTION ::= MTEXT
46 VAR-DESCRIPTION ::= MTEXT
47 VALUE ::= MTEXT | SYMBOL | INTEGER
48 VALUE-CANDIDATE ::= VALUE | '(' RANGE-FROM RANGE-TO ')'
49 RANGE-FROM ::= INTEGER
52 CMD-DESCRIPTION ::= MTEXT
56 @c IM-DECLARATION specifies the language and name of this input
59 @c DESCRIPTION specifies @c MTEXT as the description text of this
62 @c VARIABLE-DECLARATION declares a variable used in this input method.
63 If a variable must be initialized to the default value, or is to be
64 customized by a user, it must be declared here.
66 @c COMMAND-DECLARATION declares a command used in this input method.
67 If a command must be bound to the default key sequence, or is to be
68 customized by a user, it must be declared here.
70 @c TITLE-TEXT is a text displayed on the screen when this input method
74 MAP-LIST ::= '(' 'map' MAP * ')'
76 MAP ::= '(' MAP-NAME RULE * ')'
80 RULE ::= '(' KEYSEQ MAP-ACTION * ')'
82 KEYSEQ ::= MTEXT | '(' [ SYMBOL | INTEGER ] * ')'
85 @c SYMBOL in the definitions of @c MAP-NAME must not be @c t nor @c
88 @c MTEXT in the definition of @c KEYSEQ consists of characters that
89 can be generated by a keyboard. Therefore @c MTEXT usually contains
90 only ASCII characters. However, if the input method is intended to be
91 used, for instance, with a West European keyboard, @c MTEXT may
92 contain Latin-1 characters.
94 @c SYMBOL in the definition of @c KEYSEQ must be the return value of
95 the minput_event_to_key () function.
@c INTEGER in the definition of @c KEYSEQ must be a valid character code.
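The three forms of @c KEYSEQ can be pictured with a small Python model. This is only a sketch, not the driver's code: the function name keyseq_to_keys is hypothetical, the key name "C-u" is an illustrative symbol, and the sketch assumes that every character of an @c MTEXT stands for one key.

```python
def keyseq_to_keys(keyseq):
    """Expand a KEYSEQ into a flat list of key names (hypothetical helper)."""
    if isinstance(keyseq, str):
        # MTEXT form: assume each character stands for one key.
        return list(keyseq)
    keys = []
    for item in keyseq:
        if isinstance(item, int):
            # INTEGER form: a character code.
            keys.append(chr(item))
        else:
            # SYMBOL form: a key name, kept as-is.
            keys.append(item)
    return keys


print(keyseq_to_keys("a'"))
print(keyseq_to_keys(["C-u", 0x61]))
```

In this model the list form is what lets a definition mention keys, such as control keys, that have no printable character.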
101 MAP-ACTION ::= ACTION
103 ACTION ::= INSERT | DELETE | SELECT | MOVE | MARK
104 | SHOW | HIDE | PUSHBACK | UNDO | UNHANDLE | SHIFT | CALL
105 | SET | IF | COND | '(' MACRO-NAME ')'
107 PREDEFINED-SYMBOL ::=
108 '@0' | '@1' | '@2' | '@3' | '@4'
109 | '@5' | '@6' | '@7' | '@8' | '@9'
110 | '@<' | '@=' | '@>' | '@-' | '@+' | '@[' | '@]'
115 MACRO-LIST ::= '(' 'macro' MACRO * ')'
117 MACRO ::= '(' MACRO-NAME MACRO-ACTION * ')'
119 MACRO-NAME ::= SYMBOL
121 MACRO-ACTION ::= ACTION
124 MODULE-LIST ::= '(' 'module' MODULE * ')'
126 MODULE ::= '(' MODULE-NAME FUNCTION * ')'
128 MODULE-NAME ::= SYMBOL
Each @c MODULE declares the name of an external module (i.e. a dynamic
library) and the names of the functions exported by the module. If a
@c FUNCTION is named "init", it is called with only the default arguments
(see the section about @c CALL) when an input context is created for the
input method. If a @c FUNCTION is named "fini", it is called with only the
default arguments when an input context is destroyed.
141 STATE-LIST ::= '(' 'state' STATE * ')'
143 STATE ::= '(' STATE-NAME [ STATE-TITLE-TEXT ] BRANCH * ')'
145 STATE-NAME ::= SYMBOL
147 STATE-TITLE-TEXT ::= MTEXT
149 BRANCH ::= '(' MAP-NAME BRANCH-ACTION * ')'
150 | '(' nil BRANCH-ACTION * ')'
151 | '(' t BRANCH-ACTION * ')'
154 The optional @c STATE-TITLE-TEXT specifies a title text displayed on
155 the screen when the input method is in this state. If @c
156 STATE-TITLE-TEXT is omitted, @c TITLE-TEXT is used.
158 In the first form of @c BRANCH, @c MAP-NAME must be an item that
159 appears in @c MAP. In this case, if a key sequence matching one of @c
160 KEYSEQs of @c MAP-NAME is typed, @c BRANCH-ACTIONs are executed.
In the second form of @c BRANCH, @c BRANCH-ACTIONs are executed if a
key sequence that doesn't match any @c BRANCH of the current
state is typed.
166 In the third form of @c BRANCH, @c BRANCH-ACTIONs are executed when
167 shifted to the current state. If the current state is the initial
168 state, @c BRANCH-ACTIONs are executed also when an input context of
169 the input method is created.
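The three forms of @c BRANCH can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the driver's algorithm: a state is represented as a dictionary whose keys are map names, None (for nil) or True (for t), each map is a set of complete key sequences, and the map contents are hypothetical. The sketch returns only the @c BRANCH-ACTIONs and ignores the @c MAP-ACTIONs of the matching rule.

```python
def actions_on_enter(state):
    """Third form: actions run when the state is shifted to."""
    return state.get(True, [])


def actions_on_keyseq(state, maps, keyseq):
    """First form if some map of the state matches, second form otherwise."""
    for map_name, actions in state.items():
        if map_name is None or map_name is True:
            continue
        if keyseq in maps[map_name]:
            return actions
    return state.get(None, [])


# Hypothetical maps and state, loosely following EXAMPLE 3.
maps = {"pinyin": {"ni", "hao"}}
main = {
    "pinyin": ["(shift select)"],
    None: ["(hide)", "(shift init)"],
    True: ["(show)"],
}
print(actions_on_enter(main))
print(actions_on_keyseq(main, maps, "ni"))
print(actions_on_keyseq(main, maps, "q"))
```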
172 BRANCH-ACTION ::= ACTION
175 An input method has the following two lists of symbols.
A marker is a symbol indicating a character position in the preediting
text. The @c MARK action assigns a position to a marker. The
position of a marker is referred to by the @c MOVE and the @c DELETE actions.
186 A variable is a symbol associated with an integer value. The value of
187 a variable is set by the @c SET action, and is referred by the @c SET,
188 the @c INSERT, and the @c IF actions. All variables are implicitly
193 Each @c PREDEFINED-SYMBOL has a special meaning when used as a marker.
196 <li> @c @@0, @c @@1, @c @@2, @c @@3, @c @@4, @c @@5, @c @@6, @c @@7, @c @@8, @c @@9
198 The 0th, 1st, 2nd, ... 9th position respectively.
200 <li> @c @@<, @c @@=, @c @@>
202 The first, the current, and the last position.
206 The previous and the next position.
210 The previous and the next position where a candidate list changes.
Some of the @c PREDEFINED-SYMBOLs have a special meaning when used as a
candidate index in the @c SELECT action.
218 <li> @c @@<, @c @@=, @c @@>
220 The first, the current, and the last candidate of the current candidate group.
224 The previous candidate. If the current candidate is the first one in
225 the current candidate group, then it means the last candidate in the
226 previous candidate group.
230 The next candidate. If the current candidate is the last one in the
231 current candidate group, then it means the first candidate in the next
236 The candidate in the previous and the next candidate group having the same
237 candidate index as the current one.
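The candidate index rules above can be sketched in Python as follows. The sketch uses descriptive names ("first", "previous_group", and so on) instead of the predefined symbols, and represents the selection as a (group, index) pair; both choices are assumptions made for illustration. Two behaviors are not stated in the text and are assumed here: the selection stays put when there is no previous or next group, and the index is clamped when the target group is shorter.

```python
def select(groups, pos, how):
    """Return the new (group, index) pair for a candidate index keyword."""
    g, i = pos
    if how == "first":
        return (g, 0)
    if how == "current":
        return (g, i)
    if how == "last":
        return (g, len(groups[g]) - 1)
    if how == "previous":
        if i > 0:
            return (g, i - 1)
        if g > 0:
            # Wrap to the last candidate of the previous group.
            return (g - 1, len(groups[g - 1]) - 1)
        return (g, i)  # assumption: nothing before the very first candidate
    if how == "next":
        if i < len(groups[g]) - 1:
            return (g, i + 1)
        if g < len(groups) - 1:
            # Wrap to the first candidate of the next group.
            return (g + 1, 0)
        return (g, i)  # assumption: nothing after the very last candidate
    if how in ("previous_group", "next_group"):
        ng = g - 1 if how == "previous_group" else g + 1
        if not 0 <= ng < len(groups):
            return (g, i)  # assumption: no such group, stay put
        # Same index in the neighboring group, clamped (assumption).
        return (ng, min(i, len(groups[ng]) - 1))
    raise ValueError(how)


groups = [["a", "b", "c"], ["d", "e"]]
print(select(groups, (0, 2), "next"))
print(select(groups, (1, 0), "previous"))
```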
The following symbol also has a special meaning.
245 Number of handled keys at that moment.
The following symbols support surrounding text handling.
Here, @c N is a positive integer. The value is the character at the Nth
position before the current caret in the surrounding text.
When this is used as the argument of the @c DELETE action, it specifies
how many preceding characters in the surrounding text to delete.
Here, @c N is a positive integer. The value is the character at the Nth
position after the current caret in the surrounding text.
When this is used as the argument of the @c DELETE action, it specifies
how many following characters in the surrounding text to delete.
267 The arguments and the behavior of each action are listed below.
270 INSERT ::= '(' 'insert' MTEXT ')'
273 | '(' 'insert' SYMBOL ')'
274 | '(' 'insert' '(' CANDIDATES * ')' ')'
275 | '(' CANDIDATES * ')'
277 CANDIDATES ::= MTEXT | '(' MTEXT * ')'
280 The first and second forms insert @c MTEXT before the current position.
282 The third form inserts the character @c INTEGER before the current
285 The fourth form treats @c SYMBOL as a variable, and inserts its value
286 (if it is a valid character code) before the current position.
288 In the fifth and sixth forms, each @c CANDIDATES represents a
289 candidate group, and each element of @c CANDIDATES represents a
290 candidate, i.e. if @c CANDIDATES is an M-text, the candidates are the
291 characters in the M-text; if @c CANDIDATES is a list of M-texts, the
292 candidates are the M-texts in the list.
294 These forms insert the first candidate before the current position.
295 The inserted string is associated with the list of candidates and
296 the information indicating the currently selected candidate.
298 The marker positions affected by the insertion are automatically relocated.
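How @c CANDIDATES turn into candidate groups can be shown with a short Python sketch. The function names are hypothetical, M-texts are modelled as Python strings, and the sketch assumes that the current position ends up just after the inserted candidate.

```python
def candidate_groups(candidates_list):
    """Turn each CANDIDATES item into one candidate group.

    A string (M-text) yields its characters as candidates;
    a list of strings yields the strings themselves.
    """
    return [list(c) for c in candidates_list]


def insert_first_candidate(preedit, pos, groups):
    """Insert the first candidate before position pos.

    Returns the new preedit text and the new current position
    (assumed to be just after the inserted candidate).
    """
    first = groups[0][0]
    return preedit[:pos] + first + preedit[pos:], pos + len(first)


groups = candidate_groups(["你呢尼", ["北京", "背景"]])
print(groups)
print(insert_first_candidate("", 0, groups))
```

The driver additionally remembers the candidate list and the selected candidate for the inserted string; the sketch leaves that bookkeeping out.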
301 DELETE ::= '(' 'delete' SYMBOL ')'
302 | '(' 'delete' INTEGER ')'
305 The first form treats @c SYMBOL as a marker, and deletes characters
306 between the current position and the marker position.
308 The second form treats @c INTEGER as a character position, and deletes
309 characters between the current position and the character position.
311 The marker positions affected by the deletion are automatically relocated.
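The following Python sketch illustrates deletion with marker relocation. The text does not spell out the relocation rule, so the one used here is an assumption: markers inside the deleted range collapse to its start, and markers after it shift left by the number of deleted characters. The function name and the marker names are hypothetical.

```python
def delete_between(text, cur, target, markers):
    """Delete the characters between cur and target.

    Returns the new text, the new current position, and the
    relocated markers (relocation rule is an assumption).
    """
    lo, hi = min(cur, target), max(cur, target)
    removed = hi - lo
    new_text = text[:lo] + text[hi:]
    new_markers = {}
    for name, p in markers.items():
        if p >= hi:
            p -= removed      # after the deleted range: shift left
        elif p > lo:
            p = lo            # inside the deleted range: collapse
        new_markers[name] = p
    return new_text, lo, new_markers


print(delete_between("abcdef", 5, 2, {"m1": 1, "m2": 3, "m3": 6}))
```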
314 SELECT ::= '(' 'select' PREDEFINED-SYMBOL ')'
315 | '(' 'select' INTEGER ')'
318 This action first checks if the character just before the current position
319 belongs to a string that is associated with a candidate list. If it is,
320 the action replaces that string with a candidate specified by the
323 The first form treats @c PREDEFINED-SYMBOL as a candidate index (as
324 described above) that specifies a new candidate in the candidate list.
326 The second form treats @c INTEGER as a candidate index that specifies a
327 new candidate in the candidate list.
This action instructs the input method driver to display a candidate
list associated with the string before the current position.
340 This action instructs the input method driver to hide the currently
341 displayed candidate list.
344 MOVE ::= '(' 'move' SYMBOL ')'
345 | '(' 'move' INTEGER ')'
348 The first form treats @c SYMBOL as a marker, and makes the marker
349 position be the new current position.
351 The second form treats @c INTEGER as a character position, and makes
352 that position be the new current position.
355 MARK ::= '(' 'mark' SYMBOL ')'
358 This action treats @c SYMBOL as a marker, and sets its position to the
359 current position. @c SYMBOL must not be a @c PREDEFINED-SYMBOL.
PUSHBACK ::= '(' 'pushback' INTEGER ')'
363 | '(' 'pushback' KEYSEQ ')'
366 The first form pushes back the latest @c INTEGER number of key events
367 to the event queue if @c INTEGER is positive, and pushes back all key
368 events if @c INTEGER is zero.
370 The second form pushes back keys in @c KEYSEQ to the event queue.
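The first form of @c PUSHBACK can be sketched in Python as below. This is a model under assumptions, not the driver's implementation: handled keys and the event queue are plain lists, pushed-back keys are placed at the front of the queue in their original order, and a count larger than the number of handled keys is clamped. The second form is not modelled.

```python
def pushback(handled, queue, count):
    """Move the latest `count` handled keys back to the front of the queue.

    A count of zero pushes back every handled key.
    Returns the new (handled, queue) pair.
    """
    if count < 0:
        raise ValueError("count must not be negative")
    n = len(handled) if count == 0 else min(count, len(handled))
    split = len(handled) - n
    return handled[:split], handled[split:] + queue


print(pushback(["n", "i", "1"], ["x"], 1))
print(pushback(["n", "i"], [], 0))
```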
UNDO ::= '(' 'undo' [ INTEGER | SYMBOL ] ')'
If there's no argument, this action cancels the last two key events
(i.e. the one that invoked this command, and the previous one).
If there's an integer argument NUM, it must be nonzero. If it is
positive, the events from the NUMth one to the last one are
canceled. If it is negative, the last (- NUM) events are canceled.
If there's a symbol argument, it must resolve to an integer, and
that integer is treated as the actual argument as above.
UNHANDLE ::= '(' 'unhandle' ')'
This action commits the current preedit and returns the last key as
unhandled.
SHIFT ::= '(' 'shift' STATE-NAME ')'
397 This action shifts the current state to @c STATE-NAME. @c
398 STATE-NAME must appear in @c STATE-LIST.
401 CALL ::= '(' 'call' MODULE-NAME FUNCTION ARG * ')'
403 ARG ::= INTEGER | SYMBOL | MTEXT | PLIST
406 This action calls the function @c FUNCTION of external module @c
407 MODULE-NAME. @c MODULE-NAME and @c FUNCTION must appear in @c
410 The function is called with an argument of the type (#MPlist *). The
411 key of the first element is #Mt and its value is a pointer to an
412 object of the type #MInputContext. The key of the second element is
413 #Msymbol and its value is the current state name. @c ARGs are used as
414 the value of the third and later elements. Their keys are determined
415 automatically; if an @c ARG is an integer, the corresponding key is
416 #Minteger; if an @c ARG is a symbol, the corresponding key is
419 The function must return NULL or a value of the type (#MPlist *) that
420 represents a list of actions to take.
423 SET ::= '(' CMD SYMBOL1 EXPRESSION ')'
425 CMD ::= 'set' | 'add' | 'sub' | 'mul' | 'div'
427 EXPRESSION ::= INTEGER | SYMBOL2 | '(' OPERAND EXPRESSION * ')'
429 OPERAND ::= '+' | '-' | '*' | '/' | '|' | '&' | '!'
430 | '=' | '<' | '>' | '<=' | '>='
434 This action treats @c SYMBOL1 and @c SYMBOL2 as variables and sets the
435 value of @c SYMBOL1 as below.
437 If @c CMD is 'set', it sets the value of @c SYMBOL1 to the value of @c
440 If @c CMD is 'add', it increments the value of @c SYMBOL1 by the value
443 If @c CMD is 'sub', it decrements the value of @c SYMBOL1 by the value
446 If @c CMD is 'mul', it multiplies the value of @c SYMBOL1 by the value
449 If @c CMD is 'div', it divides the value of @c SYMBOL1 by the value of
453 IF ::= '(' CONDITION ACTION-LIST1 ACTION-LIST2 ')'
455 CONDITION ::= [ '=' | '<' | '>' | '<=' | '>=' ] EXPRESSION1 EXPRESSION2
457 ACTION-LIST1 ::= '(' ACTION * ')'
459 ACTION-LIST2 ::= '(' ACTION * ')'
462 This action performs actions in @c ACTION-LIST1 if @c CONDITION is
463 true, and performs @c ACTION-LIST2 (if any) otherwise.
465 @c SYMBOL1 and @c SYMBOL2 are treated as variables.
COND ::= '(' 'cond' [ '(' EXPRESSION ACTION * ')' ] * ')'
471 This action performs the first action @c ACTION whose corresponding
472 @c EXPRESSION has nonzero value.
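The @c SET, @c IF and @c COND actions can be illustrated with a small Python evaluator. It is a sketch under several assumptions that the text does not confirm: only two-operand arithmetic and comparison expressions are handled ('|', '&' and '!' are left out), a variable that was never set reads as 0, and division truncates toward zero as in C. All function names are hypothetical.

```python
def c_div(a, b):
    """Integer division truncating toward zero (assumption, mirrors C)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": c_div,
    "=": lambda a, b: int(a == b),
    "<": lambda a, b: int(a < b),
    ">": lambda a, b: int(a > b),
    "<=": lambda a, b: int(a <= b),
    ">=": lambda a, b: int(a >= b),
}
CMD_TO_OP = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def evaluate(expr, variables):
    """Evaluate INTEGER, SYMBOL, or a two-operand (OP, EXPR, EXPR) tuple."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return variables.get(expr, 0)  # assumption: unset variables are 0
    op, first, second = expr
    return OPS[op](evaluate(first, variables), evaluate(second, variables))


def do_set(cmd, symbol, expr, variables):
    """Model of SET: cmd is 'set', 'add', 'sub', 'mul' or 'div'."""
    value = evaluate(expr, variables)
    if cmd != "set":
        value = OPS[CMD_TO_OP[cmd]](variables.get(symbol, 0), value)
    variables[symbol] = value


def do_if(op, expr1, expr2, list1, list2, variables):
    """Model of IF: pick ACTION-LIST1 when the condition holds."""
    return list1 if evaluate((op, expr1, expr2), variables) else list2


def do_cond(clauses, variables):
    """Model of COND: actions of the first clause with a nonzero value."""
    for expr, actions in clauses:
        if evaluate(expr, variables) != 0:
            return actions
    return []


v = {}
do_set("set", "code", 2, v)
do_set("mul", "code", 16, v)
do_set("add", "code", ("+", 1, "count"), v)
print(v["code"])
```

The functions return the chosen action list instead of executing it, which keeps the sketch independent of how actions themselves are run.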
476 @section im-example1 EXAMPLE 1
478 This is a very simple example for inputting Latin characters with
479 diacritical marks (acute and cedilla). For instance, when you type:
481 Comme'die-Franc,aise, chic,,
486 Commédie-Française, chic,
491 \hskip5mm\texttt{\footnotesize Comm\'{e}die-Fran\c{c}aise, chic,}
The definition of the input method is as simple as shown below, and it is
straightforward to extend it to cover all Latin characters.
500 (title "latin-postfix")
503 ("a'" ?á) ("e'" ?é) ("i'" ?í) ("o'" ?ó) ("u'" ?ú) ("c," ?ç)
504 ("A'" ?Á) ("E'" ?É) ("I'" ?Í) ("O'" ?Ó) ("U'" ?Ú) ("C," ?Ç)
505 ("a''" "a'") ("e''" "e'") ("i''" "i'") ("o''" "o'") ("u''" "u'")
507 ("A''" "A'") ("E''" "E'") ("I''" "I'") ("O''" "O'") ("U''" "U'")
516 \texttt{\footnotesize
517 \hskip2mm(title "latin-postfix")\\
520 \hskip6mm ("a'" ?\'{a}) ("e'" ?\'{e}) ("i'" ?\'{i}) ("o'" ?\'{o})
521 ("u'" ?\'{u}) ("c," ?\c{c})\\
522 \hskip6mm ("A'" ?\'{A}) ("E'" ?\'{E}) ("I'" ?\'{I}) ("O'" ?\'{O})
523 ("U'" ?\'{U}) ("C," ?\c{C})\\
524 \hskip6mm ("a''" "a'") ("e''" "e'") ("i''" "i'") ("o''" "o'") ("u''" "u'")\\
525 \hskip6mm ("c,," "c,")\\
526 \hskip6mm ("A''" "A'") ("E''" "E'") ("I''" "I'") ("O''" "O'") ("U''" "U'")\\
527 \hskip6mm ("C,," "C,")))\\
534 @section im-example2 EXAMPLE 2
536 This example is for inputting Unicode characters by typing C-u
537 (Control-u) followed by four hexadecimal digits. For instance, when
538 you type ("^u" means Control-u):
540 ^u2190^u2191^u2192^u2193
542 you will get this (Unicode arrow symbols):
547 The definition utilizes @c SET and @c IF commands as below:
554 ("0" ?0) ("1" ?1) ... ("9" ?9) ("a" ?A) ("b" ?B) ... ("f" ?F)))
557 (starter (set code 0) (set count 0) (shift unicode)))
563 (mul code 16) (add code this)
566 ((delete @<) (insert code) (shift init))))))
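The accumulation performed by this definition (multiply the code by 16, add the new digit, emit the character after four digits) can be traced with a Python sketch. It models the two states as a plain variable and uses Python's int(key, 16) for digit conversion, which is a substitute chosen for the sketch rather than what the definition does. The key name "C-u" and the function name are illustrative, and a non-hexadecimal key typed after C-u is not handled.

```python
def simulate(keys):
    """Trace the 'init' and 'unicode' states over a list of key names."""
    out = []
    state = "init"
    code = count = 0
    for key in keys:
        if state == "init":
            if key == "C-u":
                # starter: reset the accumulators and shift state
                code, count, state = 0, 0, "unicode"
            else:
                out.append(key)
        else:
            # (mul code 16) (add code this), then count the digit
            code = code * 16 + int(key, 16)
            count += 1
            if count == 4:
                out.append(chr(code))
                state = "init"
    return "".join(out)


print(simulate(["C-u", "2", "1", "9", "0"]))
```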
569 @section im-example3 EXAMPLE 3
571 This example is for inputting Chinese characters by typing PinYin key
574 For instance, when you type:
583 The definition utilizes @c CANDIDATE and @c SELECT commands as below.
584 Note that this is just an example, and it ignores such important key
591 ;; The initial character of Pinyin.
593 ("a") ("b") ... ("h") ("j") ... ("t") ("w") ("x") ("y") ("z"))
595 ;; Big table of Pinyin vs the corresponding Chinese characters.
598 ("bei" ("被北备背悲辈杯倍贝碑" ...))
599 ("hao" ("好号毫豪浩耗皓嚎昊郝" ...))
600 ("jing" ("经京精境警竟静惊景敬" ...))
601 ("ni" ("你呢尼泥逆倪匿拟腻妮" ...))
603 ;; Typing 1, 2, ..., 0 selects the 0th, 1st, ..., 9th candidate.
605 ("1" (select 0)) ("2" (select 1)) ... ("9" (select 8)) ("0" (select 9))))
609 ;; When an initial character of Pinyin is typed, re-handle it in
610 ;; "main" state. Anything else is just produced as is.
611 (starter (show) (pushback 1) (shift main)))
614 ;; When a complete Pinyin sequence is typed, shift to "select" state
615 ;; to allow users to select one from the candidates.
616 (pinyin (shift select))
618 ;; When anything else is typed, produce the current candidate (if
619 ;; any), and re-handle the last input in "init" state.
620 (nil (hide) (shift init)))
;; When a number is typed, select the corresponding candidate,
;; produce it, and shift to "init" state.
625 (choose (hide) (shift init))
627 ;; When anything else is typed, produce the current candidate,
628 ;; and re-handle the last input in "init" state.
629 (nil (hide) (shift init))))
635 \fbox{This example is readable only in the documentation of HTML version.}
642 @section im-seealso SEE ALSO
644 @ref mim-list "Input Methods provided by the m17n database",
645 @ref mdbGeneral "mdbGeneral(5)"
649 Copyright (C) 2003, 2004, 2005
650 National Institute of Advanced Industrial Science and Technology (AIST)
651 Registration Number H15PRO112
653 This file is part of the m17n database; a sub-part of the m17n
656 The m17n library is free software; you can redistribute it and/or
657 modify it under the terms of the GNU Lesser General Public License
658 as published by the Free Software Foundation; either version 2.1 of
659 the License, or (at your option) any later version.
661 The m17n library is distributed in the hope that it will be useful,
662 but WITHOUT ANY WARRANTY; without even the implied warranty of
663 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
664 Lesser General Public License for more details.
666 You should have received a copy of the GNU Lesser General Public
667 License along with the m17n library; if not, write to the Free
668 Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
672 /* Local Variables: */