1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
3 <!--Converted with LaTeX2HTML 2K.1beta (1.48)
4 original version by: Nikos Drakos, CBLU, University of Leeds
5 * revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
6 * with significant contributions from:
7 Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<TITLE>Character Attribute Database</TITLE>
<META NAME="description" CONTENT="Character Attribute Database">
12 <META NAME="keywords" CONTENT="main">
13 <META NAME="resource-type" CONTENT="document">
14 <META NAME="distribution" CONTENT="global">
16 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=euc-jp">
17 <META NAME="Generator" CONTENT="LaTeX2HTML v2K.1beta">
18 <META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
20 <LINK REL="STYLESHEET" HREF="main.css">
22 <LINK REL="next" HREF="node5.html">
23 <LINK REL="previous" HREF="node3.html">
24 <LINK REL="up" HREF="main.html">
25 <LINK REL="next" HREF="node5.html">
29 <!--Navigation Panel-->
32 <IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
33 SRC="/usr/share/latex2html/icons/next.png"></A>
36 <IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
37 SRC="/usr/share/latex2html/icons/up.png"></A>
40 <IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
41 SRC="/usr/share/latex2html/icons/prev.png"></A>
<B> Next:</B> <A NAME="tex2html85"
HREF="node5.html">A global character database based on Topic Maps</A>
<B> Up:</B> <A NAME="tex2html83"
HREF="main.html">FY2001 Exploratory Software Creation Project: Construction of character object technology based on a character database (contract number</A>
<B> Previous:</B> <A NAME="tex2html77"
HREF="node3.html">A document editing system based on a character database</A>
51 <!--End of Navigation Panel-->
52 <!--Table of Child-Links-->
53 <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></A>
<LI><A NAME="tex2html86"
HREF="#SECTION00410000000000000000">Naming conventions for character attribute names</A>
<LI><A NAME="tex2html87"
HREF="#SECTION00420000000000000000">Code point attributes</A>
<LI><A NAME="tex2html88"
HREF="#SECTION00430000000000000000">The =>ucs attribute</A>
<LI><A NAME="tex2html89"
HREF="#SECTION00440000000000000000">The ->decomposition attribute</A>
<LI><A NAME="tex2html90"
HREF="#SECTION00450000000000000000">Component-composition information for Kanji</A>
67 <!--End of Table of Child-Links-->
70 <H1><A NAME="SECTION00400000000000000000"></A>
71 <A NAME="cha:char-db"></A>
As described so far, an implementation of the "UTF-2000 method" (hereafter
called a "UTF-2000 implementation") must, whenever it processes a character,
refer to the attributes of that character that the processing requires. A database
that stores the attributes of each character in machine-readable form is therefore needed.
<P>
For this reason we are developing a character attribute database expressed in the define-char
format. Its purpose is to demonstrate UTF-2000 technology, and we also have in view
the construction of a standard database that can serve as the basis for future
interchange of character information based on UTF-2000 technology.
<LI>the correspondence table between CNS 11643 and Morohashi's Dai Kan-Wa Jiten <A NAME="tex2html11"
HREF="footnode.html#foot437"><SUP><IMG ALIGN="BOTTOM" BORDER="1" ALT="[*]"
SRC="/usr/share/latex2html/icons/footnote.png"></SUP></A>
<LI>the CDP (Chinese Document Processing) database <A NAME="tex2html12"
HREF="footnode.html#foot438"><SUP><IMG ALIGN="BOTTOM" BORDER="1" ALT="[*]"
SRC="/usr/share/latex2html/icons/footnote.png"></SUP></A>
<LI>the CBETA (Chinese Buddhist Electronic Text Association) gaiji (non-standard character) data
<LI>the CHINA3 gaiji database <A NAME="tex2html13"
HREF="footnode.html#foot439"><SUP><IMG ALIGN="BOTTOM" BORDER="1" ALT="[*]"
SRC="/usr/share/latex2html/icons/footnote.png"></SUP></A>
<LI>various other databases that the developers have compiled so far
The database integrates these sources, among others, and corrects the inconsistencies between them. It still contains many errors and its quality is not
high, but at present it holds definitions for about 70,000 characters.
In this standard character database, non-Kanji characters largely follow the definitions of Unicode
[<A HREF="node6.html#Unicode">5</A>], whereas for Kanji even minute differences in glyph shape
are distinguished. Among the Kanji characters (glyph shapes), those whose shape is not identical to the one in the Dai Kan-Wa Jiten
carry, as the value of an attribute named <TT>morohashi-daikanwa</TT>, the Dai Kan-Wa number,
the degree of difference, and a serial number. Likewise, for those whose shape is not identical to the representative glyph in <I>The Unicode
Standard</I> [<A HREF="node6.html#Unicode">5</A>], the corresponding
Unicode code point is given as the value of an attribute named <code>=>ucs</code> <A NAME="tex2html14"
HREF="footnode.html#foot445"><SUP><IMG ALIGN="BOTTOM" BORDER="1" ALT="[*]"
SRC="/usr/share/latex2html/icons/footnote.png"></SUP></A>.
For a UTF-2000 implementation, the character attribute database defines the behavior of the implementation,
so the character attributes needed for processing must be associated with that behavior.
That is, at least for the character attributes used in processing, names, types, and
meanings must be defined. Specifying the format and meaning of character attributes is also important
when people use the database as a dictionary and when the database
is maintained.
<P>
From this viewpoint we have specified, for XEmacs UTF-2000, naming conventions for some character attributes
and the format and meaning of some others, and we are developing a character attribute database
based on them. This chapter describes the conventions for character attributes and the character attribute database.
144 <H1><A NAME="SECTION00410000000000000000"></A>
145 <A NAME="sec:attribute-naming"></A>
Neither the UTF-2000 model nor XEmacs UTF-2000 currently specifies the meaning of character attributes, except for
the "code point attributes" described in Section <A HREF="node4.html#sec:coded-charset">4.2</A> and
the "<code>->decomposition</code> attribute" described in Section <A HREF="node4.html#sec:decomposition">4.4</A>.
However, for building and maintaining a character attribute database,
and for implementing application programs that use one,
some conventions are desirable.
We are therefore establishing naming conventions for character attributes empirically, trying to associate patterns in attribute names
with a rough format and meaning.
162 <H2><A NAME="SECTION00411000000000000000">
Attributes for relations between characters</A>
<P>
When there exist characters $\gamma_i$ that stand in the relation <I>foo</I> to a character $C$,
the attribute <code>-></code><I>foo</I> of $C$ means that each element $\gamma_i$ of its value
is a <I>foo</I> of $C$. The value $\gamma_1 ... \gamma_n$ of <code>-></code><I>foo</I> is a list.
<P>
The attribute <code><-</code><I>foo</I> of a character $C$ means that $C$ is a <I>foo</I> of each element
$\gamma_j$ of its value. As with <code>-></code><I>foo</I>,
the value $\gamma_1 ... \gamma_m$ of <code><-</code><I>foo</I> is also a list.
<DT><STRONG>Example</STRONG></DT>
<DD>Let <B>lowercase</B> be the relation that denotes the lowercase form. Then:
<DT><STRONG>(a)</STRONG></DT>
<DD>The attribute <code>(->lowercase ?a)</code> of the character <B>A</B> states that the character
<B>a</B> is the lowercase form of the character <B>A</B>.
<DT><STRONG>(b)</STRONG></DT>
<DD>The attribute <code>(<-lowercase ?A)</code> of the character <B>a</B> states that the character
<B>a</B> is the lowercase form of the character <B>A</B>.
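<P>
The pairing of the two attribute directions can be illustrated with a small sketch. It is written in Python rather than Emacs Lisp so that it can be run outside XEmacs UTF-2000; the function name <TT>invert_relations</TT> and the dictionary layout are assumptions made for this illustration and are not part of the UTF-2000 API.

```python
from collections import defaultdict


def invert_relations(forward):
    """Derive the `<-foo` attributes implied by a table of `->foo` attributes.

    `forward` maps a character to {relation name: [related characters]},
    so {"A": {"lowercase": ["a"]}} models (->lowercase ?a) on A.
    """
    backward = defaultdict(lambda: defaultdict(list))
    for char, relations in forward.items():
        for name, targets in relations.items():
            for target in targets:
                # target is a `name` of char, so target carries (<-name char)
                backward[target][name].append(char)
    return {char: dict(relations) for char, relations in backward.items()}


forward = {"A": {"lowercase": ["a"]}}
print(invert_relations(forward))  # {'a': {'lowercase': ['A']}}
```

<P>
The forward table records example (a); the derived table is example (b).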
235 <H2><A NAME="SECTION00412000000000000000"></A>
236 <A NAME="Character_Specification"></A>
Inside a UTF-2000 implementation, characters can be handled as a kind of object without being encoded,
but some method of translation is needed between the implementation and the outside world.
<P>
In that case, if a coded character set is available and can adequately represent the character in question,
the character can be represented by its code point in that coded character set
(for this purpose XEmacs UTF-2000 provides the
<I>coded-charset</I> feature described in Section <A HREF="node4.html#sec:coded-charset">4.2</A>).
However, when the character is not included in any available coded character set,
or when a corresponding character is included but the difference between the abstract character defined there
and the character to be represented is unacceptable, a coded character set cannot be used.
<P>
To solve such problems, it suffices to have a format that enumerates the properties of a character object
based on the UTF-2000 method. For this purpose XEmacs
UTF-2000 defines the <I>character-specification (char-spec)</I> format.
<P>
A character-specification is a Lisp association list in which each element
represents one character attribute. The key part of the association list (the car of each element)
gives the attribute name, and the value part (the cdr of each element) gives the attribute value.
<P>
A character-specification denotes the abstract character that has those properties (concrete characters (written characters) …).
<P>
This format is the same as the one given as the argument of the function define-char.
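<P>
As a rough illustration of how an association-list character-specification can be read, the following Python sketch models a char-spec as a list of (name, value) pairs. The helper names <TT>get_attribute</TT> and <TT>satisfies</TT>, the sample attribute values, and the <TT>name</TT> attribute are hypothetical; only the attribute name <TT>ucs</TT> and the <TT>->lowercase</TT> pattern come from this chapter.

```python
def get_attribute(char_spec, name, default=None):
    """Look up an attribute in a char-spec given as (name, value) pairs."""
    for key, value in char_spec:  # key plays the role of car, value of cdr
        if key == name:
            return value
    return default


def satisfies(attributes, char_spec):
    """Return True if an attribute table has every property listed in char_spec."""
    return all(attributes.get(key) == value for key, value in char_spec)


# Hypothetical char-spec: "the character whose UCS code point is U+0041
# and whose lowercase form is a".
char_spec = [("ucs", 0x0041), ("->lowercase", ["a"])]

capital_a = {"ucs": 0x0041, "->lowercase": ["a"], "name": "LATIN CAPITAL LETTER A"}
capital_b = {"ucs": 0x0042, "->lowercase": ["b"]}

print(get_attribute(char_spec, "ucs"))  # 65
print(satisfies(capital_a, char_spec))  # True
print(satisfies(capital_b, char_spec))  # False
```

<P>
A character may carry more attributes than the specification lists; the specification only states the properties that must hold.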
277 <H2><A NAME="SECTION00413000000000000000">
When a character database describes relations between characters, one sometimes wants to attach attributes
not only to the characters listed in the value but also to the relation itself. For example,
normalization rules differ between applications, so a variant-character database built for normalization
must record which normalization rule it follows.
Likewise, in a scholarly database, when philological theories about a character differ,
the source and the theory being followed must be recorded.
In addition, managing intellectual property rights also requires recording the source of the data and rights information.
<P>
For these purposes, XEmacs UTF-2000 defines the <I>character-reference
(char-ref)</I> format.
<P>
A character-reference is a Lisp property list. Any attribute may be used in it,
but the meaning of some attribute names is defined in advance.
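<P>
A property-list character-reference can be pictured with the following Python sketch, which reads a flat list of alternating keys and values. The helper name <TT>plist_to_dict</TT> and the sample character are assumptions for illustration; <TT>:char</TT> and <TT>:source</TT> are the attribute names whose meaning this section defines, and <TT>kangxi</TT> and <TT>jiyun</TT> are source symbols from the table of source symbols.

```python
def plist_to_dict(plist):
    """Convert a flat property list [key1, value1, key2, value2, ...] to a dict."""
    if len(plist) % 2 != 0:
        raise ValueError("a property list needs an even number of elements")
    return dict(zip(plist[0::2], plist[1::2]))


# Hypothetical char-ref: a character annotated with the sources that attest it.
char_ref = [":char", "字", ":source", ["kangxi", "jiyun"]]

ref = plist_to_dict(char_ref)
print(ref[":char"])    # 字
print(ref[":source"])  # ['kangxi', 'jiyun']
```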
The attributes whose meaning is defined are described below:
<DT><STRONG>:char</STRONG></DT>
[Type] a character or a character-specification
<DT><STRONG>:source</STRONG></DT>
[Type] a list of symbols denoting sources or authorities (source symbols).
The source symbols used in the basic character database of XEmacs UTF-2000
are listed below. For Chinese classics and modern Chinese literature we decided to adopt source symbols in Pinyin,
the internationally used romanization of Chinese, but for historical reasons
symbols in Japanese romanization also exist. Table <A HREF="node4.html#tab:source-symbol">4.1.3</A>
lists under "Name" the source symbol names to be used from now on, and under "Alternative name"
the Japanese-romanization source symbols currently in use for historical reasons.
331 <A NAME="tab:source-symbol"></A> <DIV ALIGN="CENTER">
335 <DIV ALIGN="CENTER"> <A NAME="481"></A>
336 <TABLE CELLPADDING=3 BORDER="1">
<CAPTION><STRONG>Table 4.1:</STRONG>
The :source attribute in character-references</CAPTION>
<TR><TD ALIGN="LEFT">Name</TD>
<TD ALIGN="LEFT">Alternative name</TD>
<TD ALIGN="LEFT">Content</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">chuuka-daijiten</TD>
<TD ALIGN="LEFT">中華大字典 (Zhonghua Da Zidian)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">doubun-tsuukou</TD>
<TD ALIGN="LEFT">同文通考 (Dobun Tsuko)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">gyokuhen</TD>
<TD ALIGN="LEFT">玉篇 (Yupian)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">henkai</TD>
<TD ALIGN="LEFT">篇海 (Pianhai)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">inkai</TD>
<TD ALIGN="LEFT">韻会 (Yunhui)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">inkaiho</TD>
<TD ALIGN="LEFT">韻会補 (Yunhui Bu)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">jii</TD>
<TD ALIGN="LEFT">字彙 (Zihui)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">jiiho</TD>
<TD ALIGN="LEFT">字彙補 (Zihui Bu)</TD>
<TR><TD ALIGN="LEFT">jiyun</TD>
<TD ALIGN="LEFT">shuuin</TD>
<TD ALIGN="LEFT">集韻 (Jiyun)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">kaihen</TD>
<TD ALIGN="LEFT">海篇 (Haipian)</TD>
<TR><TD ALIGN="LEFT">kangxi</TD>
<TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">康熙字典 (Kangxi Zidian)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">kouin</TD>
<TD ALIGN="LEFT">広韻 (Guangyun)</TD>
<TR><TD ALIGN="LEFT">morohashi-daikanwa</TD>
<TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">大漢和辞典 (Dai Kan-Wa Jiten)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">ruishuu-meigishou</TD>
<TD ALIGN="LEFT">類聚名義抄 (Ruiju Myogisho)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">seiin</TD>
<TD ALIGN="LEFT">正韻 (Zhengyun)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">seiji-tsuu</TD>
<TD ALIGN="LEFT">正字通 (Zhengzitong)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">setumon-tuukun-teisei</TD>
<TD ALIGN="LEFT">説文通訓定声 (Shuowen Tongxun Dingsheng)</TD>
<TR><TD ALIGN="LEFT">shouwen</TD>
<TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">説文解字 (Shuowen Jiezi)</TD>
<TR><TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">sougen-irai-zokujifu</TD>
<TD ALIGN="LEFT">宋元以来俗字譜 (Song Yuan Yilai Suzi Pu)</TD>
<TR><TD ALIGN="LEFT">yuquan</TD>
<TD ALIGN="LEFT"> </TD>
<TD ALIGN="LEFT">玉泉 (Yuquan)</TD>
433 <H1><A NAME="SECTION00420000000000000000"></A>
434 <A NAME="sec:coded-charset"></A>
As described in Section <A HREF="node4.html#sec:coded-charset">4.2</A>, in XEmacs UTF-2000
a character attribute whose name is the name of a coded-charset is a special attribute
that gives the code point in that coded-charset. Information about this attribute is used
in character code conversion, for example in file input and output.
<P>
The value of a code point attribute is an integer. The range that the integer may take is constrained by the corresponding
coded-charset.
<P>
Incidentally, whether a given attribute name is a code point attribute name currently depends on
whether a coded-charset with that name exists. That is,
there is as yet no naming convention for code point attributes, and it cannot be decided mechanically
from the attribute name alone whether a character attribute is a code point attribute.
In the current XEmacs UTF-2000 this causes no problem because coded-charsets are defined separately,
but for the character attribute database
some convention might be desirable.
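<P>
The rule described above (an attribute is a code point attribute exactly when a coded-charset of the same name exists) can be sketched as follows in Python. The registry contents, the ranges, and the function names are illustrative assumptions; the real set of coded-charsets and their ranges is whatever XEmacs UTF-2000 defines.

```python
# Illustrative registry: coded-charset name -> range of valid code points.
CODED_CHARSETS = {
    "ascii": range(0x00, 0x80),
    "ucs": range(0x0000, 0x110000),
}


def is_code_point_attribute(name, registry=CODED_CHARSETS):
    """An attribute is a code point attribute iff a coded-charset has its name."""
    return name in registry


def check_code_point(name, value, registry=CODED_CHARSETS):
    """Return True if value is an integer inside the range of the named charset."""
    if not is_code_point_attribute(name, registry):
        raise KeyError("no coded-charset named " + repr(name))
    return isinstance(value, int) and value in registry[name]


print(is_code_point_attribute("ascii"))            # True
print(is_code_point_attribute("->decomposition"))  # False
print(check_code_point("ascii", 0x41))             # True
print(check_code_point("ascii", 0x100))            # False
```

<P>
Because the decision needs the registry, the attribute name alone is not enough, which is the limitation the paragraph above points out.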
460 <H1><A NAME="SECTION00430000000000000000">
The attribute named <code>=>ucs</code> indicates the UCS code point of
a character. If a user does not want to unify characters that are
unified in UCS, or wants to define a character that is not
included in UCS, this attribute can be used to specify the nearest
UCS code point.
<P>
If a user needs to refer to a UCS code point, the user can use
474 <TT>(or (get-char-attribute CHAR 'ucs)
475 (get-char-attribute CHAR '<code>=>ucs</code>))</TT>
478 instead of <TT>(get-char-attribute CHAR 'ucs)</TT>.
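<P>
The fallback in the expression above can be mirrored in a few lines of Python. This is only a model: attribute tables are plain dictionaries, the function name <TT>ucs_code_point</TT> is hypothetical, and the code point used is an arbitrary example.

```python
def ucs_code_point(attributes):
    """Return the `ucs` attribute if present, else fall back to `=>ucs`."""
    ucs = attributes.get("ucs")
    # Test against None explicitly: unlike Lisp's nil, Python treats 0 as false,
    # so a plain `or` would wrongly skip code point 0.
    if ucs is not None:
        return ucs
    return attributes.get("=>ucs")


unified = {"ucs": 0x5B57}    # a character that has a UCS code point of its own
variant = {"=>ucs": 0x5B57}  # a variant that only points at the nearest one

print(hex(ucs_code_point(unified)))  # 0x5b57
print(hex(ucs_code_point(variant)))  # 0x5b57
print(ucs_code_point({}))            # None
```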
The information in <code>=>ucs</code> attributes is stored internally,
and users can find the variant characters corresponding
to a UCS code point with the following function:
Function <B>char-variants</B> (<I>character</I>)
492 This function returns variants of <I>character</I>.
493 </BLOCKQUOTE></BLOCKQUOTE>
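<P>
One way to picture such a lookup is a reverse index from UCS code points to the characters whose <code>=>ucs</code> attribute names them. The Python sketch below is an assumption about how this could be organised, not a description of the XEmacs UTF-2000 internals; the database contents and the names <TT>build_variant_index</TT> and <TT>char_variants</TT> (as a Python function) are hypothetical.

```python
from collections import defaultdict


def build_variant_index(database):
    """Map each UCS code point to the characters whose `=>ucs` points at it."""
    index = defaultdict(list)
    for name, attributes in database.items():
        if "=>ucs" in attributes:
            index[attributes["=>ucs"]].append(name)
    return dict(index)


def char_variants(character, database, index):
    """Return the variants recorded for the UCS code point of `character`."""
    ucs = database[character].get("ucs")
    return index.get(ucs, [])


# Hypothetical database: character name -> attribute table.
database = {
    "base": {"ucs": 0x5B57},
    "variant-1": {"=>ucs": 0x5B57},
    "variant-2": {"=>ucs": 0x5B57},
    "other": {"ucs": 0x6587},
}
index = build_variant_index(database)

print(char_variants("base", database, index))   # ['variant-1', 'variant-2']
print(char_variants("other", database, index))  # []
```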
<BLOCKQUOTE><BLOCKQUOTE>There may be other kinds of variant relations, so we are
planning to generalize this feature.
503 <H1><A NAME="SECTION00440000000000000000"></A>
504 <A NAME="sec:decomposition"></A>
The attribute named <code>->decomposition</code> specifies the combining
sequence of a composite (precomposed) character. The value of the
<code>->decomposition</code> attribute is a list of characters or
character-specifications (Section <A HREF="node4.html#Character_Specification">4.1.2</A>), which means
that a character defined with a <code>->decomposition</code> attribute can be
interpreted as the sequence of characters given by that value.
517 For example, if <B>á</B> has an attribute
518 <TT>(<code>->decomposition</code> ?a ?´)</TT>,
519 the sequence <TT>(?a ?´)</TT> can be composed into <B>á</B>.
This information can be used by the coding-system feature, which is
the code-conversion feature of the Mule API <A NAME="tex2html16"
524 HREF="footnode.html#foot511"><SUP><IMG ALIGN="BOTTOM" BORDER="1" ALT="[*]"
525 SRC="/usr/share/latex2html/icons/footnote.png"></SUP></A>.
In addition, there is a built-in function that finds a precomposed
character from a combining sequence given as a list.
Function <B>get-composite-char</B> (<I>list</I>)
This function returns a character composed from the elements of <I>list</I>.
539 </BLOCKQUOTE></BLOCKQUOTE>
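<P>
The relation between <code>->decomposition</code> and composition can be sketched in Python by inverting a decomposition table. This is a simplified model under stated assumptions: the table holds the single example from the text, elements are characters or integers (integers are read as UCS code points), character-specifications are not handled, and the Python names are hypothetical.

```python
def build_composition_table(decompositions):
    """Invert {character: [combining sequence]} into {sequence: character}."""
    return {tuple(sequence): char for char, sequence in decompositions.items()}


def get_composite_char(elements, table):
    """Return the precomposed character for a sequence, or None if unknown."""
    # Integers are interpreted as UCS code points.
    key = tuple(chr(e) if isinstance(e, int) else e for e in elements)
    return table.get(key)


# The example from the text: a-acute decomposes into "a" followed by an acute accent.
DECOMPOSITIONS = {"\u00e1": ["a", "\u00b4"]}
TABLE = build_composition_table(DECOMPOSITIONS)

print(get_composite_char(["a", "\u00b4"], TABLE) == "\u00e1")  # True
print(get_composite_char([0x61, 0xB4], TABLE) == "\u00e1")     # True
print(get_composite_char(["a", "b"], TABLE))                   # None
```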
<BLOCKQUOTE><BLOCKQUOTE>Each element is a character, an integer or a
character-specification. If an element is an integer, it is
interpreted as a UCS code point.
551 <H1><A NAME="SECTION00450000000000000000">
Component-composition information for Kanji</A>
Many Kanji are composed of combinations of components such as left-hand radicals (hen) and right-hand components (tsukuri).
Most conventional encoding methods, however, treat the composed Kanji as the unit,
and information about the component-composition structure of Kanji has rarely been an object of encoding.
As a result, little freely usable data has accumulated.
<P>
In this project, in order to make use of component-composition information for Kanji, we defined the
<B>ideographic-structure attribute</B> described in
Section <A HREF="node4.html#sec:ideographic-structure">4.5.5</A> and made XEmacs UTF-2000
able to handle this attribute.
<P>
We also surveyed earlier attempts at component-composition information for Kanji and, for those that are freely
usable, prepared programs to convert them to the ideographic-structure format.
574 <H2><A NAME="SECTION00451000000000000000">
575 What are glyph expressions</A>
579 Kanji characters are very visibly composed not of atomic units, but of
580 a relatively small number of components. The tradition of defining a
581 character by its components is as old as the script itself. Encoding
582 of Kanji in computers, however, has so far failed to take advantage of
583 this structural feature and treated every Kanji as an atomic unit. To
584 make Kanji encoding more efficient, it has been suggested to encode
585 these parts and compose the characters on the fly. This will make the
586 rendering process more complex and potentially less appealing. The
feasibility of this will depend on the applications available for
588 rendering and certainly will require more research. Even if not used
589 as the primary encoding in text, however, such a character component
590 database will still serve an important purpose for classifying,
591 analysing and retrieving characters.
594 Since the 1970's, research concerning such an analytic encoding has
595 been conducted in Taiwan, China, Japan and elsewhere. One of the most
important and thoroughly researched proposals has been that of Hsieh
Ching-chun (謝清俊) of Academia Sinica, Taiwan. Building on previous
598 results, he started in 1990 to build a database of the structure of
599 Kanji characters. Since this work was carried out at his `Chinese
600 Document Processing' lab, it came to be known as the CDP database.
601 Christian Wittern has been involved with this project since 1994.
602 Currently, the database contains glyph expressions of more than 55500
characters, including all characters contained in the <I>漢語大辞典</I>.
604 The database has been developed on the Traditional Chinese version of
605 Windows using Access as the database engine. The user interface only
606 runs on versions of Chinese Windows from Windows 95 up to Windows ME.
607 Professor Hsieh graciously gave permission to port the content of the
608 CDP database to the UTF-2000 project and release it under the GPL.
612 <H2><A NAME="SECTION00452000000000000000">
617 The expressions in the CDP database are based on Big5, the local
encoding for Kanji characters mostly used in Taiwan. To express
parts of characters that are not characters
themselves, more than 2000 code points from the private use area (PUA)
of Big5 were used. Furthermore, the CDP database uses a set of
622 only three operators for connecting the characters, although in
623 practice, this has been expanded to 11 due to the introduction of
624 shortcut operators for handling multiple occurrences of the same
625 component in one character. Figure <A HREF="node4.html#fig:mitou-report01">4.5.2</A> shows a
626 list of these operators. There are three more operator-like
627 characters, which are used when embedding glyph expressions into
633 <DIV ALIGN="CENTER"><A NAME="fig:mitou-report01"></A><A NAME="533"></A>
635 <CAPTION ALIGN="BOTTOM"><STRONG>Figure 4.1:</STRONG>
636 The connecting operators used in the CDP</CAPTION>
<IMG
WIDTH="212" HEIGHT="327" ALIGN="BOTTOM" BORDER="0"
SRC="images/mitou-report01.jpg"
ALT="The connecting operators used in the CDP">
652 <H2><A NAME="SECTION00453000000000000000">
653 The CBETA database</A>
Another database of Chinese characters and glyph expressions, although
somewhat smaller than the CDP with around 13000 characters at the
moment, is the database developed by the Chinese Buddhist Electronic
Text Association (CBETA). It is a by-product of CBETA's
groundbreaking work of creating an electronic version of the Chinese
Buddhist scriptures. So far, more than 80 million characters of text
have been input, carefully proofread and marked up in XML according to
the Guidelines of the Text Encoding Initiative. The base character
set used for this is again Big5. Characters that could not be found
in Big5 have been collected and expressed with glyph expressions. The
CBETA database again uses a simple system of three basic connecting
operators, expressed with ASCII punctuation as follows:
673 <A NAME="tab:cbeta-op"></A> <DIV ALIGN="CENTER">
677 <DIV ALIGN="CENTER"> <A NAME="539"></A>
678 <TABLE CELLPADDING=3 BORDER="1">
679 <CAPTION><STRONG>Table 4.2:</STRONG>
680 The operators used in the CBETA character database</CAPTION>
681 <TR><TD ALIGN="LEFT">operator</TD>
682 <TD ALIGN="LEFT">meaning</TD>
684 <TR><TD ALIGN="LEFT">/</TD>
685 <TD ALIGN="LEFT">top/bottom connection</TD>
<TR><TD ALIGN="LEFT">*</TD>
691 <TD ALIGN="LEFT">left/right connection</TD>
693 <TR><TD ALIGN="LEFT">@</TD>
694 <TD ALIGN="LEFT">enclosure connection</TD>
The CBETA character database avoids reliance on characters from
the PUA. Instead, character components are expressed by using the
arithmetic operators - and + to remove and add parts of
characters. In this manner, a glyph expression for the character 寵
can be constructed as [宋-木+龍]: here the part 木 is replaced
with 龍. Using this simple arithmetic, a surprisingly large number of
characters can be expressed without much effort. Some expressions do,
however, get more complicated, for example
[((((\8f¶¹-口)-小)-日+(工/十))*支)/皿].
712 Since Christian Wittern has been involved with the CBETA project for
713 some time, it has been possible to gain permission to include the
CBETA character database in the UTF-2000 character database. This
715 is especially interesting, since the CBETA data are derived directly
716 from text input and sources for the characters are easily determined,
717 quite contrary to dictionaries and standard documents, where it is not
718 easy to find real world examples for some of the characters.
722 <H2><A NAME="SECTION00454000000000000000"></A>
723 <A NAME="sec:ideographic-structure"></A>
The ideographic-structure attribute
</H2>
<P>
To make information about the component-composition structure of Kanji usable in XEmacs UTF-2000,
we defined the <B>ideographic-structure attribute</B>.
<P>
The type of the ideographic-structure attribute is a list of characters, character-specifications, or character-references.
<P>
A character given as an element of the value of an ideographic-structure attribute (likewise a character-specification,
or the character or character-reference given by the :char attribute of a character-reference)
may itself have an ideographic-structure attribute. As a result,
ideographic-structure attributes can be nested.
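<P>
The nesting can be made concrete with a short Python sketch that expands a character into its components recursively. Several things here are assumptions for illustration: the sample structures, the convention that the first element of each structure is a connecting operator, the modelling of a character-reference as a dictionary with a <TT>:char</TT> key, the sample <TT>:source</TT> value, and the name <TT>expand</TT>. The sketch also assumes that the data contain no cycles and that <TT>:char</TT> holds a plain character.

```python
def expand(element, structures):
    """Recursively replace each component by its own structure, if it has one."""
    # A char-ref is modelled as a dict; its :char entry names the component.
    if isinstance(element, dict):
        element = element[":char"]
    structure = structures.get(element)
    if structure is None:
        return element  # no ideographic-structure attribute: a leaf
    return [expand(part, structures) for part in structure]


# Hypothetical data: each value starts with a connecting operator
# (U+2FF1 = above/below, U+2FF0 = left/right).
STRUCTURES = {
    "盟": ["\u2ff1", "明", "皿"],
    "明": ["\u2ff0", {":char": "日", ":source": ["kangxi"]}, "月"],
}

print(expand("盟", STRUCTURES))
```

<P>
The result is a nested list in which 明 has been replaced by its own structure, while 皿, which has no structure in this table, stays as it is.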
744 <H2><A NAME="SECTION00455000000000000000">
745 Extending the UTF-2000 character database</A>
In version 3.0, the Unicode Standard introduced a set of so-called
`IDEOGRAPHIC DESCRIPTION CHARACTERs' (IDC) to allow the construction of
Kanji glyph sequences. This set of operators follows a proposal from
China, based on research done there, and comprises 12 operators. For
753 the purpose of using glyph expressions in the UTF-2000 character
754 database, we decided to use the operator set from Unicode/ISO 10646.
755 This set is shown in Figure <A HREF="node4.html#fig:mitou-report02">4.5.4</A>.
760 <DIV ALIGN="CENTER"><A NAME="fig:mitou-report02"></A><A NAME="554"></A>
762 <CAPTION ALIGN="BOTTOM"><STRONG>Figure 4.2:</STRONG>
763 The IDC from Unicode/ISO 10646</CAPTION>
<IMG
WIDTH="462" HEIGHT="619" ALIGN="BOTTOM" BORDER="0"
SRC="images/mitou-report02.gif"
ALT="The IDC from Unicode/ISO 10646">
The adaptation of the CDP and CBETA databases and their subsequent inclusion in
the UTF-2000 character database thus involved the following steps:
784 <LI>Converting the underlying character code from Big5 to Unicode
786 <LI>Mapping of entries for characters outside of the reference
787 encoding Big5 to Unicode
789 <LI>Mapping of the characters from the PUA to Unicode
792 <LI>Where the previous step did not produce a mapping,
793 a recursive use of IDC was applied where possible
<LI>Modifying the glyph expressions to adjust for the
different scope of the operators
<LI>Adding new glyph expressions for characters not in CDP
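<P>
To give a feel for the operator adjustment in these steps, the following Python sketch rewrites a CBETA-style infix expression into a prefix sequence with IDC operators. It is a minimal sketch under stated assumptions, not the conversion program used in the project: the mapping of <TT>/</TT>, <TT>*</TT> and <TT>@</TT> to U+2FF1, U+2FF0 and U+2FF4 is a guess at the nearest operators, operators are taken to be left-associative with equal precedence, and the <TT>-</TT>/<TT>+</TT> arithmetic and the surrounding square brackets are not handled. The name <TT>to_ids</TT> is hypothetical.

```python
# Assumed mapping from CBETA connecting operators to IDC operators.
IDC = {"/": "\u2ff1", "*": "\u2ff0", "@": "\u2ff4"}


def to_ids(expression):
    """Convert an infix expression such as "(A*B)/C" to prefix IDC notation."""
    pos = 0

    def parse_atom():
        nonlocal pos
        if pos >= len(expression):
            raise ValueError("unexpected end of expression")
        ch = expression[pos]
        if ch == "(":
            pos += 1
            result = parse_expr()
            if pos >= len(expression) or expression[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if ch in IDC or ch == ")":
            raise ValueError("unexpected " + repr(ch))
        pos += 1
        return ch  # a single component character

    def parse_expr():
        nonlocal pos
        left = parse_atom()
        while pos < len(expression) and expression[pos] in IDC:
            operator = IDC[expression[pos]]
            pos += 1
            right = parse_atom()
            left = operator + left + right  # prefix form: operator, then operands
        return left

    result = parse_expr()
    if pos != len(expression):
        raise ValueError("unexpected trailing input")
    return result


print(to_ids("(工/十)"))     # operator U+2FF1 followed by 工 and 十
print(to_ids("(日*月)/皿"))  # U+2FF1, then U+2FF0 日 月, then 皿
```

<P>
Expressions that use deletion and replacement would first have to be resolved into plain components, which corresponds to the mapping steps listed above.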
803 Apart from this, some related supporting tasks were also necessary.
Since it is difficult to input unknown and rare Kanji characters, a
new input method had to be devised. For this purpose, a table of
input keys of the Four Corner system, originally created by Christian
Wittern for the Kanji characters in CNS 11643:1992 as part of
`KanjiBase', has been ported and adapted so that it can be used
within UTF-2000. Additionally, input keys for Kanji radicals in
different shapes and other characters from Unicode that were not yet
covered (essentially, characters with fewer than 7 strokes) have been
added. The expanded input table now contains more than 50000
input keys and will be part of the UTF-2000 character database.
A quite different problem that requires further attention is the way
the glyph expressions are composed. The CDP database uses an
`intuitive' approach and splits characters where the most logical
cut-off line is. This is, however, not always the etymologically
correct way of splitting. In the UTF-2000 character database, we
prefer etymological splitting, and new expressions are added
in this way. The task of systematically identifying and changing the
intuitive splittings has not yet been done.
The whole process, which is not yet fully completed, involves the
tedious and time-consuming task of meticulously checking the accuracy
of every single entry for more than 70000 characters. At the time of
this writing, a first pass of porting and checking has been completed for more
than 40000 characters. This is an important and fundamental addition
to the UTF-2000 character database.
836 <DIV ALIGN="CENTER"><A NAME="fig:mitou-report03"></A><A NAME="564"></A>
838 <CAPTION ALIGN="BOTTOM"><STRONG>Figure 4.3:</STRONG>
839 A table of glyph expressions in XEmacs UTF-2000</CAPTION>
<IMG
WIDTH="610" HEIGHT="487" ALIGN="BOTTOM" BORDER="0"
SRC="images/mitou-report03.jpg"
ALT="A table of glyph expressions in XEmacs UTF-2000">
855 <H2><A NAME="SECTION00456000000000000000">
856 Additional benefits</A>
860 As has been mentioned above, a table of input keys for the Four Corner
861 method has been ported to UTF-2000 to be used as input keys. Since
862 the Four Corner numbers are systematically assigned to the four
863 corners of a character, it is possible to generate new Four Corner
864 values based on existing characters, if the composition of characters
865 is known. Since this information is exactly the content of the glyph
expressions, new Four Corner input keys can be generated
automatically, thus covering all 70000 Unicode characters. This
also provides an additional method for proofreading both the glyph
expression data and the Four Corner input codes.