1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2             "http://www.w3.org/TR/html4/loose.dtd">
3 <html lang="en">
4 <head>
5 <title>CHaracter Information Service Environment</title>
6 </head>
7 <body>
8 <p>
9 [<a href="http://cvs.m17n.org/chise/">m17n.org</a>]
10 [<a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/">
11 Kyoto University, Institute for Research in Humanities, Documentation
12             and Information Center for Chinese Studies
13 </a>]
14 </p>
15
16 <h1>
17 <table cellspacing="8">
18 <tr><th align="center" valign="top">
19 <img alt="DICCS" src="images/cm450118-s.jpg">
20 <td align="center" valign="middle">
21 <font size="+3">CHISE project</font>
22 </table>
23 </h1>
24 <p>
25 <!-- hhmts start -->
26 Last modified: Fri Sep 27 00:30:59 JST 2002
27 <!-- hhmts end -->
28 <br>
29 <b><a href="index.html.ja.iso-2022-jp"><img
30 src="images/japanese-page.png">
31 </a></b><br>
32 <hr>
33
34 <h2>About the CHISE Project</h2>
35 <p>
36 The CHISE (CHaracter Information Service Environment) project aims to
37 collect information about the characters in the scripts of the world
38 and to organize it into a knowledge base.  A new character processing
39 environment based on this architecture is currently under development.
40 </p>
41
42
43 <h2>News</h2>
44 <ul>
45    <li>2002-09-20 to 22 <a
46        href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">Tomohiko
47                          MORIOKA</a> and 
48        <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">
49        Christian WITTERN</a> are presenting at the <a
50        href="http://pnc-ecai.oiu.ac.jp/prog2.htm">
51        PNC Annual Conference and Joint Meetings 2002
52        </a>.
53    <li>2002-09-19 <a
54        href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/"
55        >Tomohiko MORIOKA</a> is presenting at the <a href="http://lc.linux.or.jp/lc2002/">
56        Linux Conference 2002</a>.
57    <li>2002-08-21 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
58        XEmacs UTF-2000 0.19 (Koriyama)
59        </a> has been released.
60 </ul>
61
62 <hr>
66 <h2>Development of a character processing architecture based on a
67 character knowledge base</h2>
68
69 <h3><a name="xemacs/">XEmacs UTF-2000</a></h3>
70 <p>
71 It is now possible to load character attributes from an external
72 database on demand ("lazy loading").
73 On 32-bit Intel (IA-32) architectures, this shrinks the executable
74 file from about 30 MB with the traditional build to about 15 MB.
75 This version can be downloaded as <a
76 href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
77 XEmacs UTF-2000 0.19 (Koriyama)</a>.
78 In addition, the utf-2000 branch of the XEmacs module in
79 /cvs/root at cvs.m17n.org can be accessed by anonymous
80 CVS.
81 </p>
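<p>
The idea of loading attributes on demand can be illustrated with a
small Python sketch.  This is not the XEmacs UTF-2000 implementation:
the class name, the attribute name <code>total-strokes</code>, and the
in-memory dictionary that stands in for the external database are all
assumptions made for illustration only.
</p>

```python
class LazyAttributeTable:
    """Character attributes fetched from an external store on first access."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: (char, attribute) -> value
        self._cache = {}         # (char, attribute) -> value already loaded
        self.fetch_count = 0     # how often the external store was consulted

    def get(self, char, attribute):
        key = (char, attribute)
        if key not in self._cache:
            # First request for this attribute: consult the external store.
            self.fetch_count += 1
            self._cache[key] = self._fetch(char, attribute)
        return self._cache[key]


# Hypothetical stand-in for the external database; a real one would
# live on disk rather than inside the program image.
EXTERNAL_DB = {
    ("\u5973", "total-strokes"): 3,
    ("\u5B50", "total-strokes"): 3,
}

table = LazyAttributeTable(lambda ch, attr: EXTERNAL_DB.get((ch, attr)))
first = table.get("\u5973", "total-strokes")
second = table.get("\u5973", "total-strokes")   # served from the cache
```

<p>
Because nothing is loaded until it is requested, the program itself
does not need to carry the full attribute tables, which is the general
reason such a design can reduce the size of an executable.
</p>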
82
83
84 <h2><a name="topicmaps">A</a>
85 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
86 character database
87 </h2>
88 <p>
89 In 2001, a prototype Topic Map engine was developed on top of
90 <a href="http://www.zope.org/">Zope</a>.  Zope proved less than
91 ideal for this purpose, so the focus for this year is on porting the
92 engine to a relational database backend.  Development currently
93 continues with PostgreSQL.  We plan to enable Topic Map editing
94 within XEmacs UTF-2000, and also to support other clients in
95 addition to it.
96 </p>
97
98
99
100 <h2>Database of features of characters</h2>
101
102 <h3>Database of the component structure of Chinese Characters</h3>
103
104 <p>
105 Based on the Ideographic Description Sequences (IDS) defined in
106 ISO/IEC 10646-1:2000 and Unicode, we are developing a database
107 that describes the structure of Chinese characters using this syntax.
108 At the moment, we use the characters in the Unicode tables as a
109 reference.  The basic <em>CJK Unified Ideographs</em>, as well as
110 <em>Extension A</em> and <em>Extension B</em>, which together
111 comprise more than 70,000 characters, are currently covered.
112 </p>
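<p>
An IDS is written in prefix notation: each Ideographic Description
Character (U+2FF0 to U+2FFB in ISO/IEC 10646-1:2000) is followed by
its operands, two for most operators and three for U+2FF2 and U+2FF3.
The following Python sketch parses such a sequence into a tree.  The
function names are hypothetical, and the sketch works on a bare IDS
string only; it makes no assumption about the layout of the table
files listed on this page.
</p>

```python
# Ideographic Description Characters U+2FF0..U+2FFB and their arity.
# U+2FF2 and U+2FF3 take three operands; all others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def parse_ids(ids):
    """Parse an IDS string into a nested tuple (operator, operand, ...)."""

    def parse_at(pos):
        if pos == len(ids):
            raise ValueError("unexpected end of IDS")
        ch = ids[pos]
        pos += 1
        if ch not in IDC_ARITY:
            return ch, pos              # a plain component character
        operands = []
        for _ in range(IDC_ARITY[ch]):
            node, pos = parse_at(pos)   # operands may themselves be nested
            operands.append(node)
        return (ch, *operands), pos

    tree, end = parse_at(0)
    if end != len(ids):
        raise ValueError("trailing characters after IDS")
    return tree


def components(tree):
    """Return the leaf components of a parsed IDS, left to right."""
    if isinstance(tree, str):
        return [tree]
    leaves = []
    for child in tree[1:]:
        leaves.extend(components(child))
    return leaves
```

<p>
For example, the sequence U+2FF0 U+5973 U+5B50 (a left-to-right
composition of U+5973 and U+5B50) parses into a three-element tuple
holding the operator and its two components.
</p>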
113
114 <p>
115 <a href="images/ids-ext-b-1.png">
116 <img align="ids" src="images/ids-ext-b-1-s.png">
117 <br>
118 Table of the component structure database
119 </a>
120 </p>
121
122 <p>
123 The following tables are currently available via anonymous CVS from <a
124 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
125 href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a> 
126 as module <a
127 href="http://cvs.m17n.org/cgi-bin/viewcvs/ids/?cvsroot=chise">ids:</a> 
128 </p>
129
130 <blockquote>
131 <dl compact>
132   <dt><a
133 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
134       IDS-UCS-Basic.txt
135       </a>
136   <dd>CJK Unified Ideographs (U+4E00 to U+9FA5) of ISO/IEC
137       10646-1:2000
138
139   <dt><a
140 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
141       IDS-UCS-Ext-A.txt
142       </a>
143   <dd>CJK Unified Ideographs Extension A (U+3400 to U+4DB5, U+FA1F and
144       U+FA23) of ISO/IEC 10646-1:2000
145
146   <dt><a
147       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
148       IDS-UCS-Compat.txt
149       </a>
150   <dd>CJK Compatibility Ideographs (U+F900 to U+FA2D, except U+FA1F
151       and U+FA23) of ISO/IEC 10646-1:2000
152
153   <dt><a
154       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
155        IDS-UCS-Ext-B-1.txt
156       </a>
157   <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 to
158       U-00021FFF) of ISO/IEC 10646-2:2001
159
160   <dt><a
161       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
162        IDS-UCS-Ext-B-2.txt
163       </a>
164   <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 to
165       U-00023FFF) of ISO/IEC 10646-2:2001
166   <dt><a
167       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
168        IDS-UCS-Ext-B-3.txt
169       </a>
170   <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 to
171       U-00025FFF) of ISO/IEC 10646-2:2001
172   <dt><a
173       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
174        IDS-UCS-Ext-B-4.txt
175       </a>
176   <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 to
177       U-00027FFF) of ISO/IEC 10646-2:2001
178   <dt><a
179       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
180        IDS-UCS-Ext-B-5.txt
181       </a>
182   <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 to
183       U-00029FFF) of ISO/IEC 10646-2:2001
184   <dt><a
185       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
186        IDS-UCS-Ext-B-6.txt
187       </a>
188   <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 to
189       U-0002A6D6) of ISO/IEC 10646-2:2001
190   <dt><a
191       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
192       IDS-UCS-Compat-Supplement.txt
193       </a>
194   <dd>CJK Compatibility Ideographs Supplement (U-0002F800 to
195       U-0002FA1D) of ISO/IEC 10646-2:2001
196   <dt><a
197       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
198       IDS-Daikanwa-01.txt
199       </a>
200   <dd>Morohashi: Daikanwa Jiten, Volume 1
201   <dt><a
202       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
203       IDS-Daikanwa-02.txt
204       </a>
205   <dd>Morohashi: Daikanwa Jiten, Volume 2
206   <dt><a
207       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
208       IDS-Daikanwa-03.txt
209       </a>
210   <dd>Morohashi: Daikanwa Jiten, Volume 3
211   <dt><a
212       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
213       IDS-Daikanwa-04.txt
214       </a>
215   <dd>Morohashi: Daikanwa Jiten, Volume 4
216   <dt><a
217       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
218       IDS-Daikanwa-05.txt
219       </a>
220   <dd>Morohashi: Daikanwa Jiten, Volume 5
221   <dt><a
222       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
223       IDS-Daikanwa-06.txt
224       </a>
225   <dd>Morohashi: Daikanwa Jiten, Volume 6
226   <dt><a
227       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
228       IDS-Daikanwa-07.txt
229       </a>
230   <dd>Morohashi: Daikanwa Jiten, Volume 7
231   <dt><a
232       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
233       IDS-Daikanwa-08.txt
234       </a>
235   <dd>Morohashi: Daikanwa Jiten, Volume 8
236   <dt><a
237       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
238       IDS-Daikanwa-09.txt
239       </a>
240   <dd>Morohashi: Daikanwa Jiten, Volume 9
241   <dt><a
242       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
243       IDS-Daikanwa-10.txt
244       </a>
245   <dd>Morohashi: Daikanwa Jiten, Volume 10
246   <dt><a
247       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
248       IDS-Daikanwa-11.txt
249       </a>
250   <dd>Morohashi: Daikanwa Jiten, Volume 11
251   <dt><a
252       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
253       IDS-Daikanwa-12.txt
254       </a>
255   <dd>Morohashi: Daikanwa Jiten, Volume 12
256   <dt><a
257       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
258       IDS-Daikanwa-dx.txt
259       </a>
260   <dd>Morohashi: Daikanwa Jiten, Additions
261   <dt><a
262       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
263       IDS-Daikanwa-ho.txt
264       </a>
265   <dd>Morohashi: Daikanwa Jiten, Appendix
266   <dt><a
267       href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
268       IDS-CBETA.txt
269       </a>
270   <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
271       Association (CBETA)</a>
272 </dl>
273 </blockquote>
274
275 <ul>
276   <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
277       (<a href="http://www.fonts.jp/">
278       Forum for development of on-the-fly generation of Kanji Fonts
279       </a>)
280       <a href="http://www.fonts.jp/search.html">
281                         Analytic tool for Kanji Fonts (in Japanese)
282       </a>
283 </ul>
284
285
286 <h3><a name="glyph">Integration and Composition of Character Glyphs
287 and Styles</a></h3> <p> The character database also collects
288 information about character glyphs and styles.  Combined with the
289 other knowledge about a character in the database, this makes it
290 possible to build a system that uses the <a href="#ids">component
291 structure information</a> to assemble the glyph for a character from
292 its components, according to the requirements of the context.  With
293 such a system, mismatches caused by erroneous association or
294 insufficient contextual information can be excluded, and it will be
295 easy to display and print character forms that have not been
296 codified and for which no font exists.
297 <ul>
298   <li>
299       <a href="http://www.fonts.jp/">
300       Forum for development of on-the-fly generation of Kanji Fonts
301       </a>
302 </ul>
303
304
305 <h3><a name="network">Mathematical analysis and visualization of
306 character knowledge</a></h3>
307 <ul>
308   <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
309       Morioka, "<a
310       href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
311       Network of Words</a>", <a href="http://arob.cc.oita-u.ac.jp/">
312       Artificial Life and Robotics 2002</a>
313       (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
314                         Presentation material
315       </a>)
316   <li>Model for the relation of Kanji characters that share a component
317       <br>
318       <a
319                         href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
320       <img alt="Image 1"
321       src="images/mage1-s.jpg"><br>Image 1</a>
322 &nbsp;<br>
323       <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
324       <img alt="Image 2"
325       src="images/mage2-s.jpg"><br>Image 2</a>
326 </ul>
327
328 <!--  <h2>TOMOYO Project</h2> -->
329 <!--  <p> -->
330 <!--  TOMOYO (Text Operation Models and Outfits for Your Objects) -->
331 <!--  The TOMOYO project, formerly called the "UTF-2000 project", -->
332 <!--  is a project to develop a character processing architecture -->
333 <!--  based on a character knowledge base. -->
334 <!--  </p> -->
335
336
337 <hr>
338 <h2>Mailing List</h2>
339 <p>
340 Discussion about the CHISE Project takes place on the CHISE-{ja|en} mailing lists.
341 <p>
342 Anybody who would like to take part in the discussion and development
343 of the CHISE Project, or who has ideas or questions about the
344 implementation or wishes for new features, is welcome to join the
345 English list, the Japanese list, or both.
346 <p>
347 To subscribe to a CHISE mailing list, send a message to the
348 corresponding address:
349 <dl compact>
350   <dt>For Japanese:
351   <dd><a href="mailto:chise-ja-ctl@m17n.org">
352       chise-ja-ctl@m17n.org</a>
353
354   <dt>For English:
355   <dd><a href="mailto:chise-en-ctl@m17n.org">
356       chise-en-ctl@m17n.org</a>
357 </dl>
358
359 with the line
360 <blockquote>subscribe Your Name</blockquote>
361 in the body of the message.  You will then receive a confirmation
362 message containing the line
363
364 <blockquote>
365 confirm PASSWORD Your Name
366 </blockquote> You will have to reply to this message to become a member.
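<p>
As an illustration, the subscription request described above could be
composed with Python's standard <code>email</code> package.  This is
only a sketch: the function name and the sender address are
hypothetical, and the message is built but not sent, since sending
depends on the local mail setup.
</p>

```python
from email.message import EmailMessage

# Control addresses as listed above.
CONTROL_ADDRESSES = {
    "ja": "chise-ja-ctl@m17n.org",
    "en": "chise-en-ctl@m17n.org",
}


def build_subscribe_message(sender, name, lang="en"):
    """Build (but do not send) a subscription request for one list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = CONTROL_ADDRESSES[lang]
    # The body carries the single line "subscribe Your Name".
    msg.set_content("subscribe " + name + "\n")
    return msg


# Hypothetical sender and name, for illustration only.
request = build_subscribe_message("taro@example.org", "Taro Yamada", "en")
```

<p>
The resulting message can be handed to any mail client or library;
the confirmation step that follows is a reply to the message the list
server sends back.
</p>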
367
368
369 <hr>
370
371 <h2>Papers and Presentations</h2>
372 <ul>
373   <li><a href="xemacs/#presentation">
374       About XEmacs</a>
375   <li><a href="#network">About mathematical analysis of Character Information</a>
376   <li>Other
377       <ul>
378         <li><a href="papers/u2k-plan.ja/">
379         "Model and Implementation of a Next Generation Multilingual
380         Processing System"
381             </a> (October 1999)
382         <li>WITTERN, Christian, "Non-system characters in XML documents", in:
383         <i>The Frontier of Asian Information Processing</i>
384         [Seminar Series of the National Documentation and
385                                                         Information Centers in Humanities] No. 10, November 2000
386         <li>MORIOKA Tomohiko, "The UTF-2000 Project", in:
387             <a
388             href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
389             Kanji and Information, No.2</a>, March 2001
390         <li>MORIOKA Tomohiko, "CHISE project &mdash; beyond the UTF-2000",
391             <a href="http://www.m17n.org/m17n2001/">
392             m17n2001: the Fifth International Symposium on Multilingual
393             Information Processing and Open Source Software
394             </a>.
395         <li>MORIOKA Tomohiko, "A Short Introduction to UTF-2000 Project",
396             the First TEI Character Set Issues Working Group (October 2001,
397             University of California, Berkeley, USA).
398         <li>WITTERN, Christian, "What is Digitisation?", in:
399             <a
400             href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
401             Kanji and Information, No.3</a>, October 2001
402         <li>MORO Shigeki, "The meaning of 'beyond character codes'", in:
403             <a
404             href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
405             Kanji and Information, No.3</a>, October 2001
406         <li>WITTERN, Christian, "Some thoughts on the digitization of Kanji",
407         <i>Information Technology and the Humanities</i>
408         [Seminar Series of the National Documentation and
409                                                         Information Centers in Humanities] No. 11, November 2001
410       </ul>
411 </ul>
412
413 <h2><a name="history">History</a></h2>
414
415 <hr>
416
417 <br>
418 <b>[<a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies</a> at the 
419 <a href="http://www.zinbun.kyoto-u.ac.jp/">
420 Institute for Research in the Humanities</a>&nbsp;
421 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/">
422 Related Projects
423 </a>]</b>
424 <p align="center"><img SRC="images/dragon.jpg" height=146 width=198>
425
426 </body>
427 </html>
428 <!-- Keep this comment at the end of the file
429 Local variables:
430 mode: text
431 tab-width: 8
432 time-stamp-line-limit:40
433 -->