1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2 "http://www.w3.org/TR/html4/loose.dtd">
5 <title>CHaracter Information Service Environment</title>
9 [<a href="http://cvs.m17n.org/chise/">m17n.org</a>]
10 [<a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/">
11 Kyoto University, Institute for Research in Humanities, Documentation
12 and Information Center for Chinese Studies
17 <table cellspacing="8">
18 <tr><th align="center" valign="top">
19 <img alt="DICCS" src="images/cm450118-s.jpg">
20 <td align="center" valign="middle">
21 <font size="+3">CHISE project</font>
26 Last modified: Fri Sep 27 00:30:59 JST 2002
29 <b><a href="index.html.ja.iso-2022-jp"><img
30 src="images/japanese-page.png">
34 <h2>About the CHISE Project</h2>
The CHISE (CHaracter Information Service Environment) project attempts
to collect information about characters in the scripts of the world
and to organize it into a knowledge base. A new processing environment
based on this knowledge base is currently under development.
<li>2002-12-07 (Sat) A special session on the CHISE Project will be held at
46 <a href="http://www.jaet.gr.jp/meeting.html#5">the 5th meeting of the Japan Association for
47 East Asian Text Processing (JAET)</a>
48 at <a href="http://www.hanazono.ac.jp/">Hanazono University</a>, Kyoto.
50 <li>2002-09-20 to 22 <a
51 href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">Tomohiko
53 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">
54 Christian WITTERN</a> made a presentation at the <a
55 href="http://pnc-ecai.oiu.ac.jp/prog2.htm">
56 PNC Annual Conference and Joint Meetings 2002
59 href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/"
60 >Tomohiko MORIOKA</a> gave a presentation at the <a href="http://lc.linux.or.jp/lc2002/">
61 Linux Conference 2002</a>
63 <li>2002-08-21 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
64 XEmacs UTF-2000 0.19 (Koriyama)
65 </a> has been released.
72 <h2>Development of a character processing architecture based on a
73 character knowledge base</h2>
75 <h3><a name="xemacs/">XEmacs UTF-2000</a></h3> <p> <!-- 外部文字データ
76 ベースから文字属性を lazy-loading 可能になりました。IA32 アーキテクチャ
77 で実行形式の大きさが従来約 30 MB だったのが約 15 MB になりました。現在、
78 cvs.m17n.org の /cvs/root のXEmacs モジュールの utf-2000 枝でから
anonymous CVS で入手可能です。--> It is now possible to load character
attributes from an external database on demand ("lazy loading"). On
Intel 32-bit processor architectures, the executable file thus shrinks
from about 30 MB with the traditional build to about 15 MB. This
version can be downloaded as <a
href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
XEmacs UTF-2000 0.19 (Koriyama)</a>. In addition, the utf-2000 branch
of the XEmacs module in /cvs/root at cvs.m17n.org can be accessed via
anonymous CVS. </p>
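<p>The on-demand loading described above can be illustrated with a
minimal Python sketch. It is not the XEmacs UTF-2000 implementation:
the class <code>LazyCharAttributes</code>, the attribute name
<code>structure</code>, and the in-memory dictionary standing in for
the external database are all assumptions made for illustration.</p>

```python
from typing import Callable, Dict, Optional

# Illustrative data only: a dictionary standing in for the external database.
EXTERNAL_DB: Dict[int, Dict[str, str]] = {
    ord("明"): {"structure": "⿰日月"},
    ord("林"): {"structure": "⿰木木"},
}


class LazyCharAttributes:
    """Fetch the attributes of a character only when they are first requested."""

    def __init__(self, fetch: Callable[[int], Optional[Dict[str, str]]]):
        self._fetch = fetch          # hypothetical accessor for the external store
        self._cache: Dict[int, Dict[str, str]] = {}
        self.fetch_count = 0         # number of real lookups, kept for illustration

    def get(self, char: str, attribute: str) -> Optional[str]:
        code = ord(char)
        if code not in self._cache:
            # First access: consult the external store and remember the result.
            self.fetch_count += 1
            self._cache[code] = self._fetch(code) or {}
        return self._cache[code].get(attribute)


attrs = LazyCharAttributes(EXTERNAL_DB.get)
print(attrs.get("明", "structure"))  # looked up in the store on first use
print(attrs.get("明", "structure"))  # served from the cache afterwards
```

<p>Nothing is read at start-up in this sketch. Each character costs one
lookup the first time it is used and none afterwards, which is the
general idea behind keeping attribute tables out of the executable.</p>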
90 <h2>A <a name="topicmaps">
91 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
In 2001, a prototype of a Topic Map engine was developed on top of
<a href="http://www.zope.org/">Zope</a>. This proved less than
ideal for the purpose, so the focus for this year is to port the
engine to a relational database backend. Development currently
continues with PostgreSQL. It is planned to enable Topic Map editing
within XEmacs UTF-2000, but also to allow multiple clients in addition
106 <h2>Database of features of characters</h2>
108 <h3>Database of the component structure of Chinese Characters</h3>
Based on the Ideographic Description Sequence (IDS) syntax defined in
ISO/IEC 10646-1:2000 and Unicode, we are now developing a database
that expresses the structure of Chinese characters using this syntax.
At the moment, we are using the characters in the Unicode tables as a
reference. The basic <em>CJK Unified Ideographs</em>, as well as
<em>Extension A</em> and <em>Extension B</em>, together more
than 70,000 characters, are currently covered.
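<p>As an illustration of the syntax, the following Python sketch parses
an IDS string into a tree of components. It is a sketch under stated
assumptions and not part of the CHISE tools: the function names
<code>parse_ids</code> and <code>components</code> are hypothetical, and
only the twelve Ideographic Description Characters U+2FF0 to U+2FFB are
handled, of which U+2FF2 and U+2FF3 take three operands and the others
two.</p>

```python
from typing import List, Tuple, Union

# Ideographic Description Characters U+2FF0..U+2FFB.
TERNARY = {"\u2ff2", "\u2ff3"}  # operators taking three operands
BINARY = {chr(c) for c in range(0x2FF0, 0x2FFC)} - TERNARY

Node = Union[str, tuple]  # a leaf character or (operator, operand, ...)


def parse_ids(ids: str) -> Node:
    """Parse a prefix-notation IDS string into a nested tuple."""

    def parse(pos: int) -> Tuple[Node, int]:
        if pos >= len(ids):
            raise ValueError("truncated IDS")
        ch = ids[pos]
        pos += 1
        if ch in TERNARY:
            arity = 3
        elif ch in BINARY:
            arity = 2
        else:
            return ch, pos  # an ordinary character is a leaf
        operands = []
        for _ in range(arity):
            node, pos = parse(pos)
            operands.append(node)
        return (ch, *operands), pos

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError("trailing characters after IDS")
    return tree


def components(tree: Node) -> List[str]:
    """Return the leaf components of a parsed IDS, left to right."""
    if isinstance(tree, str):
        return [tree]
    leaves: List[str] = []
    for child in tree[1:]:
        leaves.extend(components(child))
    return leaves


print(parse_ids("⿱⿰木目心"))
print(components(parse_ids("⿱⿰木目心")))
```

<p>The sample string describes 想 as 相 above 心, with 相 itself split
into 木 and 目, so the tree nests one operator inside another and the
component list is 木, 目, 心.</p>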
121 <a href="images/ids-ext-b-1.png">
122 <img align="ids" src="images/ids-ext-b-1-s.png">
124 Table of the component structure database
129 The following tables are currently available via anonymous CVS from <a
130 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
131 href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
133 href="http://cvs.m17n.org/cgi-bin/viewcvs/ids/?cvsroot=chise">ids:</a>
139 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
142 <dd>CJK Unified Ideographs (U+4E00 〜 U+9FA5) of ISO/IEC
146 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
149 <dd>CJK Unified Ideographs Extension A (U+3400 〜 U+4DB5, U+FA1F and
150 U+FA23) of ISO/IEC 10646-1:2000
153 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
156 <dd>CJK Compatibility Ideographs (U+F900 〜 U+FA2D, except U+FA1F
157 and U+FA23) of ISO/IEC 10646-1:2000
160 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
163 <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 〜
164 U-00021FFF) of ISO/IEC 10646-2:2001
167 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
170 <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 〜
171 U-00023FFF) of ISO/IEC 10646-2:2001
173 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
176 <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 〜
177 U-00025FFF) of ISO/IEC 10646-2:2001
179 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
182 <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 〜
183 U-00027FFF) of ISO/IEC 10646-2:2001
185 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
188 <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 〜
189 U-00029FFF) of ISO/IEC 10646-2:2001
191 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
194 <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 〜
195 U-0002A6D6) of ISO/IEC 10646-2:2001
197 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
198 IDS-UCS-Compat-Supplement.txt
200 <dd>CJK Compatibility Ideographs Supplement (U-0002F800 〜
201 U-0002FA1D) of ISO/IEC 10646-2:2001
203 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
206 <dd>Morohashi: Daikanwa Jiten, Volume 1
208 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
211 <dd>Morohashi: Daikanwa Jiten, Volume 2
213 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
216 <dd>Morohashi: Daikanwa Jiten, Volume 3
218 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
221 <dd>Morohashi: Daikanwa Jiten, Volume 4
223 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
226 <dd>Morohashi: Daikanwa Jiten, Volume 5
228 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
231 <dd>Morohashi: Daikanwa Jiten, Volume 6
233 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
236 <dd>Morohashi: Daikanwa Jiten, Volume 7
238 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
241 <dd>Morohashi: Daikanwa Jiten, Volume 8
243 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
246 <dd>Morohashi: Daikanwa Jiten, Volume 9
248 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
251 <dd>Morohashi: Daikanwa Jiten, Volume 10
253 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
256 <dd>Morohashi: Daikanwa Jiten, Volume 11
258 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
261 <dd>Morohashi: Daikanwa Jiten, Volume 12
263 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
266 <dd>Morohashi: Daikanwa Jiten, Additions
268 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
271 <dd>Morohashi: Daikanwa Jiten, Appendix
273 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
276 <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
277 Association (CBETA)</a>
282 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
283 (<a href="http://www.fonts.jp/">
284 Forum for development of on-the-fly generation of Kanji Fonts
286 <a href="http://www.fonts.jp/search.html">
287 Analytic tool for Kanji Fonts (in Japanese)
<h3><a name="glyph">Integration and Composition of Character Glyphs
and Styles</a></h3> <p> The character database also collects
information about character glyphs and styles. Combined with the other
knowledge about a character in the database, this makes it possible to
build a system that uses the <a href="#ids">component
structure information</a> to assemble the glyph for a character from
its components, depending on the contextual requirements. With
this system, mismatches caused by erroneous association
or insufficient contextual information are avoided, and it will be
possible to easily display and print character forms that have not been
codified and for which no fonts exist.
305 <a href="http://www.fonts.jp/">
306 Forum for development of on-the-fly generation of Kanji Fonts
<h3><a name="network">Mathematical analysis and visualization of
character knowledge</a></h3>
314 <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
316 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
317 Network of Words</a>”, <a href="http://arob.cc.oita-u.ac.jp/">
318 Artificial Life and Robotics 2002</a>
319 (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
320 Presentation material
322 <li>Model for the relation of Kanji characters that share a component
325 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
327 src="images/mage1-s.jpg"><br>Image 1</a>
329 <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
331 src="images/mage2-s.jpg"><br>Image 2</a>
334 <!-- <h2>TOMOYO Project</h2> -->
336 <!-- TOMOYO (Text Operation Models and Outfits for Your Objects) -->
<!-- project, formerly called the "UTF-2000 Project", is a project -->
<!-- to develop a character processing architecture -->
<!-- based on a character knowledge database. -->
344 <h2>Mailing List</h2>
Discussion about the CHISE Project takes place on the CHISE-{ja|en} mailing lists.
Anybody who would like to take part in the discussion and
development of the CHISE Project, or who has ideas or questions about
the implementation or wishes for new features, is welcome to join the
English list, the Japanese list, or both.
To become a member of a CHISE mailing list, send a message to the
357 <dd><a href="mailto:chise-ja-ctl@m17n.org">
358 chise-ja-ctl@m17n.org</a>
361 <dd><a href="mailto:chise-en-ctl@m17n.org">
362 chise-en-ctl@m17n.org</a>
366 <blockquote>subscribe Your Name</blockquote>
in the body of the message. You will then receive a confirmation
368 message with the line
371 confirm PASSWORD Your Name
372 </blockquote> You will have to reply to this message to become a member.
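<p>The subscription request can also be composed programmatically. The
following Python sketch only builds the message and does not send it.
The control addresses and the <code>subscribe</code> line are taken
from the instructions above, while the function name
<code>build_subscribe_message</code> and the sender address are
placeholders chosen for illustration.</p>

```python
from email.message import EmailMessage

# Control addresses as listed above, keyed by list language.
CONTROL_ADDRESSES = {
    "ja": "chise-ja-ctl@m17n.org",
    "en": "chise-en-ctl@m17n.org",
}


def build_subscribe_message(sender: str, name: str, language: str = "en") -> EmailMessage:
    """Build (but do not send) a subscription request for one CHISE list."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = CONTROL_ADDRESSES[language]
    # The request itself goes in the body of the message.
    message.set_content(f"subscribe {name}")
    return message


# Placeholder sender and name, for illustration only.
request = build_subscribe_message("user@example.org", "Your Name", "en")
print(request["To"])
print(request.get_content().strip())
```

<p>The confirmation step still has to be completed by replying to the
message the list server sends back.</p>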
376 <h2>Papers and Presentations</h2>
378 <li><a href="xemacs/#presentation">
379 About XEmacs UTF-2000</a>
380 <li><a href="#network">About mathematical analysis of Character Information</a>
383 <li><a href="papers/u2k-plan.ja/">
384 “Model and Implementation of a Next Generation Multilingual
385 Processing System”
386 </a> (in Japanese. October 1999)
387 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
388 “Non-system characters in XML documents”, in:
389 <i>The Frontier of Asian Information Processing</i>
390 [Seminar Series of the National Documentation and
391 Information Centers in Humanities] No. 10, November 2000
392 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
393 “The UTF-2000 Project”, in:
395 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
396 Kanji and Information, No.2</a>, March 2001 (in Japanese)
397 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
“CHISE project &mdash; beyond the UTF-2000”,
399 <a href="http://www.m17n.org/m17n2001/">
400 m17n2001: the Fifth International Symposium on Multilingual
401 Information Processing and Open Source Software
403 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
404 “A Short Introduction to UTF-2000 Project”,
405 the First TEI Character Set Issues Working Group (October 2001,
406 University of California, Berkeley, USA).
407 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
408 “What is Digitisation?”, in:
410 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
411 Kanji and Information, No.3</a>, October 2001 (in Japanese).
412 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
413 “The meaning of 'beyond character codes'”, in:
415 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
416 Kanji and Information, No.3</a>, October 2001 (in Japanese).
417 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
418 “Some thoughts on the digitization of Kanji”,
419 <i>Information Technology and the Humanities</i>
420 [Seminar Series of the National Documentation and
421 Information Centers in Humanities] No. 11, November 2001.
422 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">KAMICHI, Koichi</a>,
423 “Building KAGE (Kanji-font Automatic Generating Engine):
The Next Generation of Kanji Processing beyond the Character Code Model”
425 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
426 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
427 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
428 “Software Review: CHISE Project,”
429 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
430 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
431 <!-- <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA, Tomohiko</a>,
432 <a href="papers/dc2002.pdf">
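“Prospects for Document Processing Technology in the Post-Character-Code Era”</a>,
(Seminar Series of the National Documentation and Information Centers in Humanities, No. 12),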
440 <h2><a href="history">History</a></h2>
This project was supported by <a
href="http://www.ipa.go.jp/NBP/13nendo/13mito/koubo13.html"><span lang="ja">未踏ソフトウェア創造事業</span> (Exploratory Software Project), 2001</a>.
448 <b>[<a href="../">Above</a>]</b>
450 <p><img SRC="images/dragon.jpg" height=146 width=198></center>
455 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies (DICCS)</a>,
456 <a href="http://www.zinbun.kyoto-u.ac.jp/">Institute for Research in the Humanities</a>,
457 <a href="http://www.kyoto-u.ac.jp/">Kyoto University</a>
460 <a href="http://www.m17n.org/">m17n.org (the Organization for Multilingualization)</a>
461 <a href="http://www.aist.go.jp/">(National Institute of Advanced Industrial Science and Technology)</a>
465 <a href="http://www.hanazono.ac.jp/">Hanazono University</a>
468 <a href="http://www.aist.go.jp/">National Institute of Advanced Industrial Science and Technology</a>
471 <a href="http://bioinfo.tmd.ac.jp/">Dept. of Bioinformatics</a>,
472 <a href="http://www.tmd.ac.jp/mri/mri.html">Medical Research Institute</a>,
473 <a href="http://www.tmd.ac.jp/">Tokyo Medical and Dental University</a>
479 Last modified: Fri Nov 8 14:53:35 JST 2002
481 <a href="http://www.aurora.dti.ne.jp/~zom/Counter/index.html">
483 src="http://mousai.as.wakwak.ne.jp/cgi-bin/counterp.cgi?projects_chise-en.log"
489 <!-- Keep this comment at the end of the file
493 time-stamp-line-limit:40