1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
4 <title>CHaracter Information Service Environment</title>
5 <link rel=stylesheet href="chise.css" type="text/css">
15 <b><a href="index.html.ja.iso-2022-jp">[Japanese page]</a></b>
20 <a href="http://cvs.m17n.org/chise/"><img
21 alt="m17n.org" src="images/tomura-s.png" align="middle"></a>
22 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/"><img
23 alt="kanji.zinbun.kyoto-u.ac.jp" src="images/diccs-s.jpg" align="middle"></a>
24 <a href="http://mousai.as.wakwak.ne.jp/projects/chise/"><img
25 alt="mousai.as.wakwak.ne.jp" src="images/egret-pond-s.jpg"
32 <h1>CHISE project</h1>
36 <!--<b><a href="index.html.ja.iso-2022-jp"><img
37 src="images/japanese-page.png">
41 <h2>About the CHISE Project</h2>
The CHISE (CHaracter Information Service Environment) project aims
to collect information about the characters of the world's scripts and
to organize it into a knowledge base. A new processing environment
built on this knowledge base is currently under development.
51 <li>2003-11-28 (Fri), 29 (Sat)
52 <a href="http://coe21.zinbun.kyoto-u.ac.jp/ws-type-2003">Glyph
53 and Typesetting Workshop</a> was held at
54 <a href="http://www.kcif.or.jp/jp/footer/05.html"
55 >Kyoto City International Foundation</a>.
Presentations on the CHISE project were given at
59 <a href="http://lc.linux.or.jp/lc2003/index.html">Linux Conference
The CHISE project comprises the following sub-projects.
72 <li>Development of a character processing architecture based on a
73 character knowledge base
74 <!--字知データベースに基づく字処理アーキテクチャの開発-->
76 <li><a href="xemacs/index.html">XEmacs CHISE</a>
77 <li><a href="ruby/index.html">Ruby/CHISE</a>
78 <li><a href="perl/index.html">Perl/CHISE</a>
81 <li><a href="topicmaps/index.html">A TopicMaps based approach to a
83 <!--TopicMapsによる字知データベース・システムの開発--></a></li>
84 <li><a href="char-data/">Database of features of characters
85 <!--字に関するさまざまな知のデータベース--></a>
87 <li><a href="ids/index.html">Database of the component structure of
88 Chinese Characters<!--字構造情報データベース--></a></li>
<li><a href="glyph/index.html">Integration and Composition of
90 Character Glyphs and Styles<!--グリフ・字形情報の合と合成--></a></li>
<li><a href="visualization/index.html">Mathematical analysis and visualization
94 of character knowledge<!--字知情報の数理的析と可--></a></li>
95 <li><a href="omega/index.html">Omega/CHISE: Typesetting System in cooperation
96 with character knowledge database
97 <!--字データベースと連したシステム--></a></li>
101 <h2>Development of a character processing architecture based on a
102 character knowledge base</h2>
103 <h3><a name="xemacs/">XEmacs UTF-2000</a></h3> <p>
It is now possible to load character
attributes from an external database on demand ("lazy loading"). On
32-bit Intel processor architectures, the executable file
thus shrinks from the 30 MB required by the traditional build to
about 15 MB. This version can be downloaded as <a
href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
XEmacs UTF-2000 0.19 (Koriyama)</a>. In addition, there is a UTF-2000
branch of the XEmacs tree at cvs.m17n.org in /cvs/root, which can be
accessed by anonymous CVS. </p>
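<p>
The general idea of loading character attributes on demand can be
illustrated with a minimal Python sketch. It is not the XEmacs UTF-2000
implementation: the table name <code>char_attr</code>, the attribute name
<code>total-strokes</code>, and the in-memory SQLite database standing in
for the external database are all assumptions made for illustration.
</p>

```python
import sqlite3


class LazyCharAttributes:
    """Fetch character attributes from an external database only when asked."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}  # code point -> dict of attributes already loaded

    def get(self, char, name):
        code = ord(char)
        if code not in self._cache:
            # First access for this character: query the database once.
            rows = self._conn.execute(
                "SELECT name, value FROM char_attr WHERE code = ?", (code,)
            ).fetchall()
            self._cache[code] = dict(rows)
        return self._cache[code].get(name)

    def loaded_count(self):
        """Number of characters whose attributes are held in memory."""
        return len(self._cache)


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE char_attr (code INTEGER, name TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO char_attr VALUES (?, ?, ?)",
        [
            (0x4E00, "total-strokes", "1"),
            (0x4E8C, "total-strokes", "2"),
        ],
    )
    return conn


if __name__ == "__main__":
    attrs = LazyCharAttributes(build_demo_db())
    print(attrs.loaded_count())                  # nothing is loaded at start-up
    print(attrs.get("\u4e00", "total-strokes"))  # triggers one database query
    print(attrs.loaded_count())
```

<p>
Because nothing is read until a character is first looked up, the program
image does not need to carry the attribute tables itself, which is the
effect described above.
</p>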
114 <h2>A <a name="topicmaps">
115 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
In 2001, a prototype Topic Map engine was developed based
on <a href="http://www.zope.org/">Zope</a>. This proved less than
ideal for the purpose, so the focus for this year is to port the
engine to a relational database backend. Development currently
continues with PostgreSQL. It is planned to enable Topic Map editing
within XEmacs UTF-2000, but also to allow multiple clients in addition
128 <h2>Database of features of characters</h2>
130 <h3>Database of the component structure of Chinese Characters</h3>
Based on the Ideographic Description Sequence (IDS) syntax defined in
ISO/IEC 10646-1:2000 and Unicode, we are now developing a database
that expresses the structure of Chinese characters.
At the moment, we are using the characters in the Unicode tables as a
reference. The basic <em>CJK Unified Ideographs</em>, as well as
<em>Extension A</em> and <em>Extension B</em>, together more
than 70,000 characters, are currently covered.
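<p>
As an illustration of the IDS syntax, the following Python sketch parses
a description sequence into a tree. It is not part of the CHISE tools.
It assumes only the twelve Ideographic Description Characters U+2FF0 to
U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the rest take
two; the function names are hypothetical.
</p>

```python
# Ideographic Description Characters U+2FF0..U+2FFB and their operand counts.
# U+2FF2 and U+2FF3 take three operands; the others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2ff2"] = 3
IDC_ARITY["\u2ff3"] = 3


def parse_ids(ids):
    """Parse an IDS string into a nested (operator, [operands]) tree."""
    node, pos = _parse_at(ids, 0)
    if pos != len(ids):
        raise ValueError("trailing characters after a complete IDS")
    return node


def _parse_at(ids, pos):
    """Parse one operand starting at pos; return (node, next position)."""
    if pos >= len(ids):
        raise ValueError("IDS ended before all operands were read")
    ch = ids[pos]
    pos += 1
    if ch not in IDC_ARITY:
        return ch, pos  # a plain component character
    operands = []
    for _ in range(IDC_ARITY[ch]):
        operand, pos = _parse_at(ids, pos)
        operands.append(operand)
    return (ch, operands), pos


def components(tree):
    """Return the leaf components of a parsed IDS, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, operands = tree
    return [leaf for operand in operands for leaf in components(operand)]


if __name__ == "__main__":
    tree = parse_ids("⿰氵⿰古月")  # one possible description of 湖
    print(tree)
    print(components(tree))
```

<p>
Keeping the result as a tree rather than a flat list preserves the nesting
of the description, so a query for characters sharing a component can work
on the leaves while layout-oriented tools can still use the operators.
</p>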
143 <a href="images/ids-ext-b-1.png">
144 <img align="ids" src="images/ids-ext-b-1-s.png">
146 Table of the component structure database
151 The following tables are currently available via anonymous CVS from <a
152 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
153 href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
155 href="http://cvs.m17n.org/cgi-bin/viewcvs/ids/?cvsroot=chise">ids:</a>
161 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
164 <dd>CJK Unified Ideographs (U+4E00 〜 U+9FA5) of ISO/IEC
168 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
171 <dd>CJK Unified Ideographs Extension A (U+3400 〜 U+4DB5, U+FA1F and
172 U+FA23) of ISO/IEC 10646-1:2000
175 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
178 <dd>CJK Compatibility Ideographs (U+F900 〜 U+FA2D, except U+FA1F
179 and U+FA23) of ISO/IEC 10646-1:2000
182 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
185 <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 〜
186 U-00021FFF) of ISO/IEC 10646-2:2001
189 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
192 <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 〜
193 U-00023FFF) of ISO/IEC 10646-2:2001
195 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
198 <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 〜
199 U-00025FFF) of ISO/IEC 10646-2:2001
201 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
204 <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 〜
205 U-00027FFF) of ISO/IEC 10646-2:2001
207 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
210 <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 〜
211 U-00029FFF) of ISO/IEC 10646-2:2001
213 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
216 <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 〜
217 U-0002A6D6) of ISO/IEC 10646-2:2001
219 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
220 IDS-UCS-Compat-Supplement.txt
222 <dd>CJK Compatibility Ideographs Supplement (U-0002F800 〜
223 U-0002FA1D) of ISO/IEC 10646-2:2001
225 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
228 <dd>Morohashi: Daikanwa Jiten, Volume 1
230 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
233 <dd>Morohashi: Daikanwa Jiten, Volume 2
235 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
238 <dd>Morohashi: Daikanwa Jiten, Volume 3
240 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
243 <dd>Morohashi: Daikanwa Jiten, Volume 4
245 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
248 <dd>Morohashi: Daikanwa Jiten, Volume 5
250 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
253 <dd>Morohashi: Daikanwa Jiten, Volume 6
255 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
258 <dd>Morohashi: Daikanwa Jiten, Volume 7
260 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
263 <dd>Morohashi: Daikanwa Jiten, Volume 8
265 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
268 <dd>Morohashi: Daikanwa Jiten, Volume 9
270 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
273 <dd>Morohashi: Daikanwa Jiten, Volume 10
275 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
278 <dd>Morohashi: Daikanwa Jiten, Volume 11
280 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
283 <dd>Morohashi: Daikanwa Jiten, Volume 12
285 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
288 <dd>Morohashi: Daikanwa Jiten, Additions
290 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
293 <dd>Morohashi: Daikanwa Jiten, Appendix
295 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
298 <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
299 Association (CBETA)</a>
304 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
305 (<a href="http://www.fonts.jp/">
306 Forum for development of on-the-fly generation of Kanji Fonts
308 <a href="http://www.fonts.jp/search.html">
309 Analytic tool for Kanji Fonts (in Japanese)
<h3><a name="glyph">Integration and Composition of Character Glyphs
and Styles</a></h3> <p> The character database also collects information
about character glyphs and styles. Combined with the other knowledge
about a character in the database, this information makes it possible
to build a system that uses the <a href="#ids">component
structure information</a> to assemble the font for a character
from its components, depending on the contextual requirements. With
this system, mismatches caused by erroneous association
or insufficient contextual information can be excluded, and it will be
possible to easily display and print character forms that have not been
codified and for which no fonts exist.
326 <a href="http://www.fonts.jp/">
327 Forum for development of on-the-fly generation of Kanji Fonts
<h3><a name="network">Mathematical analysis and visualization of
character knowledge</a></h3>
335 <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
337 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
338 Network of Words</a>”, <a href="http://arob.cc.oita-u.ac.jp/">
339 Artificial Life and Robotics 2002</a>
340 (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
341 Presentation material
343 <li>Model for the relation of Kanji characters that share a component
346 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
348 src="images/mage1-s.jpg"><br>Image 1</a>
350 <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
352 src="images/mage2-s.jpg"><br>Image 2</a>
357 <h2>CVS Repository</h2>
359 <a href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
363 <h2>Mailing List</h2>
Discussion about the CHISE Project takes place on the CHISE-{ja|en} mailing lists.
Anyone who would like to take part in the discussion and
development of the CHISE Project, or who has ideas or questions about the
implementation or requests for new features, is welcome to join
the English list, the Japanese list, or both.
To join a CHISE mailing list, send a message to the
376 <dd><a href="mailto:chise-ja-ctl@m17n.org">
377 chise-ja-ctl@m17n.org</a>
380 <dd><a href="mailto:chise-en-ctl@m17n.org">
381 chise-en-ctl@m17n.org</a>
385 <blockquote>subscribe Your Name</blockquote>
in the body of the message. You will then receive a confirmation
message with the line
390 confirm PASSWORD Your Name
391 </blockquote> You will have to reply to this message to become a member.
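<p>
For those who prefer to script the request, the subscription message
described above could be composed as in the following Python sketch. It
only builds the message and does not send it. The sender address is a
placeholder, and the mapping of "ja" and "en" to the two control
addresses is inferred from the address names.
</p>

```python
from email.message import EmailMessage

# Control addresses as listed on this page; the "ja"/"en" keys are an
# assumption based on the address names.
CONTROL_ADDRESSES = {
    "ja": "chise-ja-ctl@m17n.org",
    "en": "chise-en-ctl@m17n.org",
}


def build_subscribe_message(list_lang, sender, name):
    """Build (but do not send) a subscription request for a CHISE list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = CONTROL_ADDRESSES[list_lang]
    # The request is a single line in the body: "subscribe Your Name".
    msg.set_content("subscribe " + name)
    return msg


if __name__ == "__main__":
    # "you@example.org" is a placeholder sender address.
    print(build_subscribe_message("en", "you@example.org", "Your Name"))
```

<p>
The resulting message can be handed to any mail client or to
<code>smtplib</code>; the confirmation step still has to be completed by
replying to the message the list server sends back.
</p>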
395 <h2>Papers and Presentations</h2>
397 <li><a href="xemacs/#presentation">
398 About XEmacs UTF-2000</a>
399 <li><a href="#network">About mathematical analysis of Character Information</a>
402 <li><a href="papers/u2k-plan.ja/">
403 “Model and Implementation of a Next Generation Multilingual
404 Processing System”
405 </a> (in Japanese. October 1999)
406 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
407 “Non-system characters in XML documents”, in:
408 <i>The Frontier of Asian Information Processing</i>
409 [Seminar Series of the National Documentation and
410 Information Centers in Humanities] No. 10, November 2000
411 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
412 “The UTF-2000 Project”, in:
414 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
415 Kanji and Information, No.2</a>, March 2001 (in Japanese)
416 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
“CHISE project &mdash; beyond the UTF-2000”,
418 <a href="http://www.m17n.org/m17n2001/">
419 m17n2001: the Fifth International Symposium on Multilingual
420 Information Processing and Open Source Software
422 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
423 “A Short Introduction to UTF-2000 Project”,
424 the First TEI Character Set Issues Working Group (October 2001,
425 University of California, Berkeley, USA).
426 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
427 “What is Digitisation?”, in:
429 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
430 Kanji and Information, No.3</a>, October 2001 (in Japanese).
431 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
432 “The meaning of 'beyond character codes'”, in:
434 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
435 Kanji and Information, No.3</a>, October 2001 (in Japanese).
436 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
437 “Some thoughts on the digitization of Kanji”,
438 <i>Information Technology and the Humanities</i>
439 [Seminar Series of the National Documentation and
440 Information Centers in Humanities] No. 11, November 2001.
441 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">KAMICHI, Koichi</a>,
442 “Building KAGE (Kanji-font Automatic Generating Engine):
The Next Generation of Kanji Processing beyond the Character Code Model”
444 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
445 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
446 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
447 “Software Review: CHISE Project,”
448 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
449 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
450 <!-- <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA, Tomohiko</a>,
451 <a href="papers/dc2002.pdf">
452 「ポスト文字コード時代の文書処理技術に関する展望」</a>、
454 (全国文献・情報センター人文社会科学学術セミナーシリーズ No.12),
459 <h2><a href="history">History</a></h2>
461 This project was assisted by <a
462 href="http://www.ipa.go.jp/NBP/13nendo/13mito/koubo13.html"><span lang="ja">未踏ソフトウェア創造事業</span>, 2001</a>.
467 <b>[<a href="../">Above</a>]</b>
469 <p><img SRC="images/dragon.jpg" height=146 width=198></center>
474 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies (DICCS)</a>,
475 <a href="http://www.zinbun.kyoto-u.ac.jp/">Institute for Research in the Humanities</a>,
476 <a href="http://www.kyoto-u.ac.jp/">Kyoto University</a>
479 <a href="http://www.m17n.org/">m17n.org (the Organization for Multilingualization)</a>
480 <a href="http://www.aist.go.jp/">(National Institute of Advanced Industrial Science and Technology)</a>
484 <a href="http://www.hanazono.ac.jp/">Hanazono University</a>
487 <a href="http://www.aist.go.jp/">National Institute of Advanced Industrial Science and Technology</a>
490 <a href="http://bioinfo.tmd.ac.jp/">Dept. of Bioinformatics</a>,
491 <a href="http://www.tmd.ac.jp/mri/mri.html">Medical Research Institute</a>,
492 <a href="http://www.tmd.ac.jp/">Tokyo Medical and Dental University</a>
498 Last modified: Fri Jan 9 18:57:43 JST 2004
500 <a href="http://www.aurora.dti.ne.jp/~zom/Counter/index.html">
502 src="http://mousai.as.wakwak.ne.jp/cgi-bin/counterp.cgi?projects_chise-en.log"
508 <!-- Keep this comment at the end of the file
512 time-stamp-line-limit:40