1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
4 <title>CHaracter Information Service Environment</title>
5 <link rel=stylesheet href="chise.css" type="text/css">
15 <b><a href="index.html.ja.iso-2022-jp">[Japanese page]</a></b>
20 <a href="http://cvs.m17n.org/chise/"><img
21 alt="m17n.org" src="images/tomura-s.png" align="middle"></a>
22 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/"><img
23 alt="kanji.zinbun.kyoto-u.ac.jp" src="images/diccs-s.jpg" align="middle"></a>
24 <a href="http://mousai.as.wakwak.ne.jp/projects/chise/"><img
25 alt="mousai.as.wakwak.ne.jp" src="images/egret-pond-s.jpg"
32 <h1>CHISE project</h1>
36 <!--<b><a href="index.html.ja.iso-2022-jp"><img
37 src="images/japanese-page.png">
41 <h2>About the CHISE Project</h2>
The CHISE (CHaracter Information Service Environment) project aims to
collect information about the characters of the world's scripts and to
organize it into a knowledge base. A new character processing environment
based on this architecture is currently under development.
51 <li><a href="News/20051013-15.html">CHISE Conference
52 2005</a> will be held this October 13 (Thu), 14 (Fri)
53 at <a href="http://www.kcif.or.jp/en/">Kyoto International
54 Community House</a>.</li>
55 <li><a href="http://mousai.kanji.zinbun.kyoto-u.ac.jp/ids-find">
56 CHISE-IDS Hanzi/Hanja/Kanji Searcher
</a> has been published.</li>
58 <!-- <li>2004-06-09 (Wed)
59 Tomohiko Morioka will make a presentation on CHISE Project in
60 <a href="http://kura.hanazono.ac.jp/kanji/20040609symposium.html"
61 >Symposium: <i>Frontier of Character Information Processing:
62 Past, Presenta and Future</i></a>.</li>
A presentation on the CHISE Project was given at
65 <a href="http://www.sigch.soken.ac.jp/2004.05/">the 62nd meeting of
66 the IPSJ SIG Computers and the Humanities</a>.</li>
67 <li>2003-11-28 (Fri), 29 (Sat)
68 <a href="http://coe21.zinbun.kyoto-u.ac.jp/ws-type-2003">Glyph
69 and Typesetting Workshop</a> was held at
70 <a href="http://www.kcif.or.jp/jp/footer/05.html"
71 >Kyoto City International Foundation</a>.
73 <!-- <li>2003-10-31 (Fri) -->
74 <!-- Presentations on the CHISE project were made in -->
75 <!-- <a href="http://lc.linux.or.jp/lc2003/index.html">Linux Conference -->
The CHISE project consists of the following sub-projects.
88 <li>Development of a character processing architecture based on a
89 character knowledge base
90 <!--文字知識データベースに基づく文字処理アーキテクチャの開発-->
92 <li><a href="xemacs/index.html">XEmacs CHISE</a>
93 <li><a href="ruby/index.html">Ruby/CHISE</a>
94 <li><a href="perl/index.html">Perl/CHISE</a>
97 <li><a href="topicmaps/index.html">A TopicMaps based approach to a
99 <!--TopicMapsによる文字知識データベース・システムの開発--></a></li>
100 <li><a href="char-data/">Database of features of characters
101 <!--文字に関するさまざまな知識のデータベース化--></a>
103 <li><a href="ids/index.html">Database of the component structure of
104 Chinese Characters<!--漢字構造情報データベース--></a></li>
<li><a href="glyph/index.html">Integration and Composition of
Character Glyphs and Styles<!--グリフ・字形情報の統合と合成--></a></li>
<li><a href="visualization/index.html">Mathematical analysis and visualization
of character knowledge<!--文字知識情報の数理的解析と可視化--></a></li>
<li><a href="omega/index.html">Omega/CHISE: Typesetting system working in cooperation
with a character knowledge database
113 <!--文字データベースと連携した組版システム--></a></li>
117 <h2>Development of a character processing architecture based on a
118 character knowledge base</h2>
119 <h3><a name="xemacs/">XEmacs UTF-2000</a></h3> <p>
It is now possible to load character
attributes from an external database on demand ("lazy loading"). On
32-bit Intel processor architectures, the size of the executable file
thus shrinks from the 30 MB required by the traditional build to
just about 15 MB. This version can be downloaded as <a
href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
XEmacs UTF-2000 0.19 (Koriyama)</a>. In addition, there is a UTF-2000
branch of the XEmacs tree at cvs.m17n.org in /cvs/root, which can be
accessed by anonymous CVS.</p>
130 <h2>A <a name="topicmaps">
131 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
In 2001, a prototype of a Topic Map engine was developed based
on <a href="http://www.zope.org/">Zope</a>. This proved less than
ideal for this purpose, so the focus for this year is to port the
engine to a relational database backend. Development currently
continues with PostgreSQL. It is planned to enable Topic Map editing
within XEmacs UTF-2000, but also to allow multiple clients in addition
144 <h2>Database of features of characters</h2>
146 <h3>Database of the component structure of Chinese Characters</h3>
Based on the Ideographic Description Sequence (IDS) syntax of
ISO/IEC 10646-1:2000 and Unicode, we are now developing a database
that expresses the structure of Chinese characters.
At the moment, we are using the characters in the Unicode tables as a
reference. The basic <em>CJK Unified Ideographs</em>, as well as
<em>Extension A</em> and <em>Extension B</em>, together more
than 70,000 characters, are currently covered.
159 <a href="images/ids-ext-b-1.png">
160 <img align="ids" src="images/ids-ext-b-1-s.png">
162 Table of the component structure database
167 The following tables are currently available via anonymous CVS from <a
168 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
169 href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
171 href="http://cvs.m17n.org/cgi-bin/viewcvs/ids/?cvsroot=chise">ids:</a>
177 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
180 <dd>CJK Unified Ideographs (U+4E00 〜 U+9FA5) of ISO/IEC
184 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
187 <dd>CJK Unified Ideographs Extension A (U+3400 〜 U+4DB5, U+FA1F and
188 U+FA23) of ISO/IEC 10646-1:2000
191 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
194 <dd>CJK Compatibility Ideographs (U+F900 〜 U+FA2D, except U+FA1F
195 and U+FA23) of ISO/IEC 10646-1:2000
198 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
201 <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 〜
202 U-00021FFF) of ISO/IEC 10646-2:2001
205 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
208 <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 〜
209 U-00023FFF) of ISO/IEC 10646-2:2001
211 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
214 <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 〜
215 U-00025FFF) of ISO/IEC 10646-2:2001
217 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
220 <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 〜
221 U-00027FFF) of ISO/IEC 10646-2:2001
223 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
226 <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 〜
227 U-00029FFF) of ISO/IEC 10646-2:2001
229 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
232 <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 〜
233 U-0002A6D6) of ISO/IEC 10646-2:2001
235 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
236 IDS-UCS-Compat-Supplement.txt
238 <dd>CJK Compatibility Ideographs Supplement (U-0002F800 〜
239 U-0002FA1D) of ISO/IEC 10646-2:2001
241 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
244 <dd>Morohashi: Daikanwa Jiten, Volume 1
246 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
249 <dd>Morohashi: Daikanwa Jiten, Volume 2
251 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
254 <dd>Morohashi: Daikanwa Jiten, Volume 3
256 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
259 <dd>Morohashi: Daikanwa Jiten, Volume 4
261 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
264 <dd>Morohashi: Daikanwa Jiten, Volume 5
266 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
269 <dd>Morohashi: Daikanwa Jiten, Volume 6
271 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
274 <dd>Morohashi: Daikanwa Jiten, Volume 7
276 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
279 <dd>Morohashi: Daikanwa Jiten, Volume 8
281 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
284 <dd>Morohashi: Daikanwa Jiten, Volume 9
286 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
289 <dd>Morohashi: Daikanwa Jiten, Volume 10
291 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
294 <dd>Morohashi: Daikanwa Jiten, Volume 11
296 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
299 <dd>Morohashi: Daikanwa Jiten, Volume 12
301 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
304 <dd>Morohashi: Daikanwa Jiten, Additions
306 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
309 <dd>Morohashi: Daikanwa Jiten, Appendix
311 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
314 <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
315 Association (CBETA)</a>
320 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
321 (<a href="http://www.fonts.jp/">
322 Forum for development of on-the-fly generation of Kanji Fonts
324 <a href="http://www.fonts.jp/search.html">
325 Analytic tool for Kanji Fonts (in Japanese)
<h3><a name="glyph">Integration and Composition of Character Glyphs
and Styles</a></h3> <p> The character database also collects information
about character glyphs and styles. Combined with the other knowledge
about a character in the database, this makes it possible to build a
system that uses the <a href="#ids">component structure
information</a> to assemble the font for a character from its
components, depending on the contextual requirements. With
this system, mismatches caused by erroneous association
or insufficient contextual information are ruled out, and it will be
possible to easily display and print character forms that have not been
codified and for which no fonts exist.
342 <a href="http://www.fonts.jp/">
343 Forum for development of on-the-fly generation of Kanji Fonts
<h3><a name="network">Mathematical analysis and visualization of
349 character knowledge</a></h3>
351 <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
353 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
354 Network of Words</a>”, <a href="http://arob.cc.oita-u.ac.jp/">
355 Artificial Life and Robotics 2002</a>
356 (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
357 Presentation material
359 <li>Model for the relation of Kanji characters that share a component
362 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
364 src="images/mage1-s.jpg"><br>Image 1</a>
366 <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
368 src="images/mage2-s.jpg"><br>Image 2</a>
373 <h2>CVS Repository</h2>
375 <a href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
379 <h2>Mailing List</h2>
Discussions about the CHISE Project take place on the CHISE-{ja|en} mailing lists.
Anybody who would like to take part in the discussion about and
development of the CHISE Project, or who has ideas or questions about
the implementation or wishes for new features, is welcome to join
the English list, the Japanese list, or both.
To become a member of a CHISE mailing list, send a message to the
392 <dd><a href="mailto:chise-ja-ctl@m17n.org">
393 chise-ja-ctl@m17n.org</a>
396 <dd><a href="mailto:chise-en-ctl@m17n.org">
397 chise-en-ctl@m17n.org</a>
401 <blockquote>subscribe Your Name</blockquote>
in the body of the message. You will then receive a confirmation
message with the line
406 confirm PASSWORD Your Name
407 </blockquote> You will have to reply to this message to become a member.
411 <h2>Papers and Presentations</h2>
413 <li><a href="xemacs/#presentation">
414 About XEmacs UTF-2000</a>
415 <li><a href="#network">About mathematical analysis of Character Information</a>
418 <li><a href="papers/u2k-plan.ja/">
419 “Model and Implementation of a Next Generation Multilingual
420 Processing System”
421 </a> (in Japanese. October 1999)
422 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
423 “Non-system characters in XML documents”, in:
424 <i>The Frontier of Asian Information Processing</i>
425 [Seminar Series of the National Documentation and
426 Information Centers in Humanities] No. 10, November 2000
427 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
428 “The UTF-2000 Project”, in:
430 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
431 Kanji and Information, No.2</a>, March 2001 (in Japanese)
432 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
“CHISE project &mdash; beyond the UTF-2000”,
434 <a href="http://www.m17n.org/m17n2001/">
435 m17n2001: the Fifth International Symposium on Multilingual
436 Information Processing and Open Source Software
438 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
439 “A Short Introduction to UTF-2000 Project”,
440 the First TEI Character Set Issues Working Group (October 2001,
441 University of California, Berkeley, USA).
442 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
443 “What is Digitisation?”, in:
445 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
446 Kanji and Information, No.3</a>, October 2001 (in Japanese).
447 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
448 “The meaning of 'beyond character codes'”, in:
450 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
451 Kanji and Information, No.3</a>, October 2001 (in Japanese).
452 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
453 “Some thoughts on the digitization of Kanji”,
454 <i>Information Technology and the Humanities</i>
455 [Seminar Series of the National Documentation and
456 Information Centers in Humanities] No. 11, November 2001.
457 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">KAMICHI, Koichi</a>,
458 “Building KAGE (Kanji-font Automatic Generating Engine):
The Next Generation of Kanji Processing beyond the Character Code Model”
460 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
461 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
462 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
463 “Software Review: CHISE Project,”
464 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
465 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
466 <!-- <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA, Tomohiko</a>,
467 <a href="papers/dc2002.pdf">
468 「ポスト文字コード時代の文書処理技術に関する展望」</a>、
470 (全国文献・情報センター人文社会科学学術セミナーシリーズ No.12),
472 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">Morioka, Tomohiko</a>,
473 <a href="http://ya.sakura.ne.jp/~moro/">Moro, Shigeki</a>.
474 “Moji-sosei ni motozuku moji-shori
475 (Character Processing based on Character Features).”
476 <cite><a href="http://www.ipsj.or.jp/members/SIGNotes/Jpn/17/2004/062/"
477 >IPSJ SIG Technical Report Vol. 2004, No. 58 (2004-CH-62)</a></cite>.
478 May, 2004. pp. 53-60. (in Japanese)</li>
483 <h2><a href="history">History</a></h2>
This project was supported by the <a
486 href="http://www.ipa.go.jp/NBP/13nendo/13mito/koubo13.html">IPA Exploratory
487 Software Project, 2001</a>.
492 <b>[<a href="../">Above</a>]</b>
494 <p><img SRC="images/dragon.jpg" height=146 width=198></center>
499 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies (DICCS)</a>,
500 <a href="http://www.zinbun.kyoto-u.ac.jp/">Institute for Research in the Humanities</a>,
501 <a href="http://www.kyoto-u.ac.jp/">Kyoto University</a>
504 <a href="http://www.m17n.org/">m17n.org (the Organization for Multilingualization)</a>
505 <a href="http://www.aist.go.jp/">(National Institute of Advanced Industrial Science and Technology)</a>
509 <a href="http://www.hanazono.ac.jp/">Hanazono University</a>
512 <a href="http://www.aist.go.jp/">National Institute of Advanced Industrial Science and Technology</a>
515 <a href="http://bioinfo.tmd.ac.jp/">Dept. of Bioinformatics</a>,
516 <a href="http://www.tmd.ac.jp/mri/mri.html">Medical Research Institute</a>,
517 <a href="http://www.tmd.ac.jp/">Tokyo Medical and Dental University</a>
523 Last modified: Mon May 17 02:42:17 JST 2004
525 <a href="http://www.aurora.dti.ne.jp/~zom/Counter/index.html">
527 src="http://mousai.as.wakwak.ne.jp/cgi-bin/counterp.cgi?projects_chise-en.log"
533 <!-- Keep this comment at the end of the file
537 time-stamp-line-limit:40