1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
4 <title>CHaracter Information Service Environment</title>
5 <link rel=stylesheet href="chise.css" type="text/css">
15 <b><a href="index.html.ja.iso-2022-jp">[Japanese page]</a></b>
20 <a href="http://cvs.m17n.org/chise/"><img
21 alt="m17n.org" src="images/tomura-s.png" align="middle"></a>
22 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/"><img
23 alt="kanji.zinbun.kyoto-u.ac.jp" src="images/diccs-s.jpg" align="middle"></a>
24 <a href="http://mousai.as.wakwak.ne.jp/projects/chise/"><img
25 alt="mousai.as.wakwak.ne.jp" src="images/egret-pond-s.jpg"
32 <h1>CHISE project</h1>
36 <!--<b><a href="index.html.ja.iso-2022-jp"><img
37 src="images/japanese-page.png">
41 <h2>About the CHISE Project</h2>
The CHISE (CHaracter Information Service Environment) project aims to
collect information about the characters in the scripts of the world
and to organize it into a knowledge base. A new processing environment
built on this knowledge base is currently under development.
51 <li>Koichi Kamichi has published
52 <a href="http://fonts.jp/chise_linkmap/">chise_linkmap
53 (a visualization system for CHISE character database)</a>,
54 <a href="http://fonts.jp/chise_swig_perl/">chise_swig_perl
55 (a libchise wrapper for perl 5)</a> and
56 <a href="http://fonts.jp/makettf/">makettf
(a simple TTF binder)</a>, which resulted from
58 <a href="News/20051013-15.html">CHISE Conference 2005
59 and CodeFest Kyoto 2005</a>.</li>
60 <li><a href="News/20051013-15.html">CHISE Conference
2005</a> was held on October 13 (Thu) and 14 (Fri), 2005,
62 at <a href="http://www.kcif.or.jp/en/">Kyoto International
63 Community House</a>.</li>
64 <li><a href="http://mousai.kanji.zinbun.kyoto-u.ac.jp/ids-find">
65 CHISE-IDS Hanzi/Hanja/Kanji Searcher
66 </a>has been published.</li>
67 <!-- <li>2004-06-09 (Wed)
68 Tomohiko Morioka will make a presentation on CHISE Project in
69 <a href="http://kura.hanazono.ac.jp/kanji/20040609symposium.html"
70 >Symposium: <i>Frontier of Character Information Processing:
71 Past, Presenta and Future</i></a>.</li>
73 A presentation on CHISE Project was made in
74 <a href="http://www.sigch.soken.ac.jp/2004.05/">the 62nd meeting of
75 the IPSJ SIG Computers and the Humanities</a>.</li>
76 <li>2003-11-28 (Fri), 29 (Sat)
77 <a href="http://coe21.zinbun.kyoto-u.ac.jp/ws-type-2003">Glyph
78 and Typesetting Workshop</a> was held at
79 <a href="http://www.kcif.or.jp/jp/footer/05.html"
80 >Kyoto City International Foundation</a>.
82 <!-- <li>2003-10-31 (Fri) -->
83 <!-- Presentations on the CHISE project were made in -->
84 <!-- <a href="http://lc.linux.or.jp/lc2003/index.html">Linux Conference -->
93 The CHISE project is the aggregate of the following sub-projects.
97 <li>Development of a character processing architecture based on a
98 character knowledge base
99 <!--文字知識データベースに基づく文字処理アーキテクチャの開発-->
102 <li><a href="xemacs/index.html">XEmacs CHISE</a>
103 <li><a href="ruby/index.html">Ruby/CHISE</a>
104 <li><a href="perl/index.html">Perl/CHISE</a>
106 <li><a href="http://fonts.jp/chise_swig_perl/"
111 <li>Concord: development of a prototyping OOP database engine
112 <li><a href="topicmaps/index.html">A TopicMaps based approach to a
114 <!--TopicMapsによる文字知識データベース・システムの開発--></a></li>
115 <li><a href="char-data/">Database of features of characters
116 <!--文字に関するさまざまな知識のデータベース化--></a>
118 <li><a href="ids/index.html">Database of the component structure of
119 Chinese Characters<!--漢字構造情報データベース--></a></li>
121 <li><a href="http://mousai.kanji.zinbun.kyoto-u.ac.jp/ids-find"
124 <li>Database about variants and related characters
126 <li><a href="http://fonts.jp/chise_linkmap/"
<li><a href="glyph/index.html">Integration and Composition of
130 Character Glyphs and Styles<!--グリフ・字形情報の統合と合
133 <li><a href="http://fonts.jp/makettf/">makettf</a>
<li><a href="visualization/index.html">Mathematical analysis and visualization
139 of character knowledge<!--文字知識情報の数理的解析と可視化--></a></li>
140 <li><a href="omega/index.html">Omega/CHISE: Typesetting System in cooperation
141 with character knowledge database
142 <!--文字データベースと連携した組版システム--></a></li>
143 <li>CHISE-core / CHISE-base: integrated package and installer
147 <h2>Development of a character processing architecture based on a
148 character knowledge base</h2>
<h3><a name="xemacs/">XEmacs UTF-2000</a></h3> <p>
Character attributes can now be loaded on demand from an external
database ("lazy loading"). On 32-bit Intel processor architectures,
this shrinks the executable file from the 30 MB required by the
traditional build to about 15 MB. This version can be downloaded as <a
href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
XEmacs UTF-2000 0.19 (Koriyama)</a>. In addition, there is a UTF-2000
branch of the XEmacs tree at cvs.m17n.org in /cvs/root, which can be
accessed by anonymous CVS.</p>
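<p>The lazy-loading idea can be illustrated with a minimal Python sketch.
This is not the XEmacs UTF-2000 implementation: the SQLite table
<code>char_attr</code>, the class <code>LazyCharAttributes</code> and the
attribute name <code>total-strokes</code> are hypothetical names chosen for
this example. Attributes are fetched from the external store only when a
character is first asked for, and are cached afterwards.</p>

```python
import sqlite3


class LazyCharAttributes:
    """Fetch character attributes from an external store on first access only."""

    def __init__(self, connection):
        self._conn = connection
        self._cache = {}    # code point -> dict of attributes already loaded
        self.db_reads = 0   # number of lookups that actually hit the database

    def get(self, code_point):
        if code_point not in self._cache:
            rows = self._conn.execute(
                "SELECT name, value FROM char_attr WHERE code_point = ?",
                (code_point,),
            ).fetchall()
            self.db_reads += 1
            self._cache[code_point] = dict(rows)
        return self._cache[code_point]


def build_demo_db():
    """Create a tiny in-memory database with toy data (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE char_attr (code_point INTEGER, name TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO char_attr VALUES (?, ?, ?)",
        [
            (0x4E00, "total-strokes", "1"),  # U+4E00
            (0x4E8C, "total-strokes", "2"),  # U+4E8C
        ],
    )
    return conn


attrs = LazyCharAttributes(build_demo_db())
first = attrs.get(0x4E8C)   # triggers one database read
again = attrs.get(0x4E8C)   # served from the in-memory cache
```

<p>Nothing is read until <code>get</code> is called, so start-up cost and
memory use depend on the characters actually used rather than on the size
of the database.</p>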
160 <h2>A <a name="topicmaps">
161 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
In 2001, a prototype Topic Map engine was developed based
on <a href="http://www.zope.org/">Zope</a>. This proved less than
ideal for the purpose, so the focus for this year is to port the
engine to a relational database backend. Development currently
continues with PostgreSQL. It is planned to enable Topic Map editing
within XEmacs UTF-2000, but also to allow multiple clients in addition
174 <h2>Database of features of characters</h2>
176 <h3>Database of the component structure of Chinese Characters</h3>
Based on the Ideographic Description Sequences (IDS) defined in
ISO/IEC 10646-1:2000 and Unicode, we are now developing a database
that expresses the structure of Chinese characters using this syntax.
At the moment, we are using the characters in the Unicode tables as a
reference. The basic <em>CJK Unified Ideographs</em>, as well as
<em>Extension A</em> and <em>Extension B</em>, together more
than 70,000 characters, are currently covered.
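<p>As an illustration of the syntax, the following Python sketch parses an
IDS string into a nested structure. It is not part of the CHISE tools; the
function names <code>parse_ids</code> and <code>leaf_components</code> are
hypothetical. It assumes the twelve Ideographic Description Characters
U+2FF0 to U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the
rest take two, and it assumes that every component is a single encoded
character.</p>

```python
# Arity of each Ideographic Description Character (U+2FF0..U+2FFB).
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3  # left / middle / right
IDC_ARITY["\u2FF3"] = 3  # top / middle / bottom


def _parse(ids, pos):
    """Parse one IDS term starting at pos; return (node, next position)."""
    if pos >= len(ids):
        raise ValueError("unexpected end of sequence")
    ch = ids[pos]
    arity = IDC_ARITY.get(ch)
    if arity is None:
        return ch, pos + 1          # a plain component character
    operands = []
    pos += 1
    for _ in range(arity):
        operand, pos = _parse(ids, pos)
        operands.append(operand)
    return (ch, *operands), pos


def parse_ids(ids):
    """Parse a whole IDS string into nested tuples (operator, operand, ...)."""
    node, end = _parse(ids, 0)
    if end != len(ids):
        raise ValueError("trailing characters after a complete sequence")
    return node


def leaf_components(node):
    """Return the component characters of a parsed IDS, left to right."""
    if isinstance(node, str):
        return [node]
    leaves = []
    for operand in node[1:]:
        leaves.extend(leaf_components(operand))
    return leaves


tree = parse_ids("\u2FF1\u2FF0\u6728\u76EE\u5FC3")  # ⿱⿰木目心
parts = leaf_components(tree)
```

<p>Because each operator has a fixed number of operands, the prefix notation
needs no brackets, and a simple recursive descent is enough to recover the
nesting.</p>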
189 <a href="images/ids-ext-b-1.png">
190 <img align="ids" src="images/ids-ext-b-1-s.png">
192 Table of the component structure database
197 The following tables are currently available via anonymous CVS from <a
198 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
199 href="http://cvs.m17n.org/viewcvs/?cvsroot=chise">/cvs/chise</a>
201 href="http://cvs.m17n.org/viewcvs/ids/?cvsroot=chise">ids:</a>
207 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
210 <dd>CJK Unified Ideographs (U+4E00 〜 U+9FA5) of ISO/IEC
214 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
217 <dd>CJK Unified Ideographs Extension A (U+3400 〜 U+4DB5, U+FA1F and
218 U+FA23) of ISO/IEC 10646-1:2000
221 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
224 <dd>CJK Compatibility Ideographs (U+F900 〜 U+FA2D, except U+FA1F
225 and U+FA23) of ISO/IEC 10646-1:2000
228 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
231 <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 〜
232 U-00021FFF) of ISO/IEC 10646-2:2001
235 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
238 <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 〜
239 U-00023FFF) of ISO/IEC 10646-2:2001
241 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
244 <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 〜
245 U-00025FFF) of ISO/IEC 10646-2:2001
247 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
250 <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 〜
251 U-00027FFF) of ISO/IEC 10646-2:2001
253 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
256 <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 〜
257 U-00029FFF) of ISO/IEC 10646-2:2001
259 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
262 <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 〜
263 U-0002A6D6) of ISO/IEC 10646-2:2001
265 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
266 IDS-UCS-Compat-Supplement.txt
268 <dd>CJK Compatibility Ideographs Supplement (U-0002F800 〜
269 U-0002FA1D) of ISO/IEC 10646-2:2001
271 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
274 <dd>Morohashi: Daikanwa Jiten, Volume 1
276 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
279 <dd>Morohashi: Daikanwa Jiten, Volume 2
281 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
284 <dd>Morohashi: Daikanwa Jiten, Volume 3
286 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
289 <dd>Morohashi: Daikanwa Jiten, Volume 4
291 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
294 <dd>Morohashi: Daikanwa Jiten, Volume 5
296 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
299 <dd>Morohashi: Daikanwa Jiten, Volume 6
301 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
304 <dd>Morohashi: Daikanwa Jiten, Volume 7
306 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
309 <dd>Morohashi: Daikanwa Jiten, Volume 8
311 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
314 <dd>Morohashi: Daikanwa Jiten, Volume 9
316 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
319 <dd>Morohashi: Daikanwa Jiten, Volume 10
321 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
324 <dd>Morohashi: Daikanwa Jiten, Volume 11
326 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
329 <dd>Morohashi: Daikanwa Jiten, Volume 12
331 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
334 <dd>Morohashi: Daikanwa Jiten, Additions
336 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
339 <dd>Morohashi: Daikanwa Jiten, Appendix
341 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
344 <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
345 Association (CBETA)</a>
350 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
351 (<a href="http://www.fonts.jp/">
352 Forum for development of on-the-fly generation of Kanji Fonts
354 <a href="http://www.fonts.jp/search.html">
355 Analytic tool for Kanji Fonts (in Japanese)
<h3><a name="glyph">Integration and Composition of Character Glyphs
and Styles</a></h3> <p> The character database also collects information
about character glyphs and styles. Combined with the other knowledge
about a character in the database, this makes it possible to build a
system that uses the <a href="#ids">component structure
information</a> to assemble the font for a character from its
components, depending on the contextual requirements. Such a system
avoids mismatches caused by erroneous association or insufficient
contextual information, and makes it easy to display and print
character forms that have not been codified and for which no fonts
exist.
372 <a href="http://www.fonts.jp/">
373 Forum for development of on-the-fly generation of Kanji Fonts
<h3><a name="network">Mathematical analysis and visualization of
379 character knowledge</a></h3>
381 <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
383 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
384 Network of Words</a>”, <a href="http://arob.cc.oita-u.ac.jp/">
385 Artificial Life and Robotics 2002</a>
386 (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
387 Presentation material
389 <li>Model for the relation of Kanji characters that share a component
392 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
394 src="images/mage1-s.jpg"><br>Image 1</a>
396 <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
398 src="images/mage2-s.jpg"><br>Image 2</a>
403 <h2>CVS Repository</h2>
405 <a href="http://cvs.m17n.org/viewcvs/?cvsroot=chise">/cvs/chise</a>
409 <h2>Mailing List</h2>
Discussion about the CHISE Project takes place on the CHISE-{ja|en} mailing lists.
Anybody who would like to take part in the discussion and
development of the CHISE Project, or who has ideas, questions about the
implementation, or wishes for new features, is welcome to join
the English list, the Japanese list, or both.
To join a CHISE mailing list, send a message to the
422 <dd><a href="mailto:chise-ja-ctl@m17n.org">
423 chise-ja-ctl@m17n.org</a>
426 <dd><a href="mailto:chise-en-ctl@m17n.org">
427 chise-en-ctl@m17n.org</a>
431 <blockquote>subscribe Your Name</blockquote>
in the body of the message. You will then receive a confirmation
433 message with the line
436 confirm PASSWORD Your Name
437 </blockquote> You will have to reply to this message to become a member.
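<p>The subscription request can also be composed programmatically. The
following Python sketch only builds the message and does not send it; the
function name <code>build_subscribe_message</code> and the sender address
are placeholders for this example, while the control addresses and the
<code>subscribe</code> line are the ones given above.</p>

```python
from email.message import EmailMessage

# Control addresses of the two lists, as given on this page.
CONTROL_ADDRESSES = {
    "ja": "chise-ja-ctl@m17n.org",
    "en": "chise-en-ctl@m17n.org",
}


def build_subscribe_message(language, your_name, sender):
    """Build (but do not send) a subscription request for one list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = CONTROL_ADDRESSES[language]
    # The request is a single line in the body of the message.
    msg.set_content(f"subscribe {your_name}")
    return msg


# Placeholder name and sender address, for illustration only.
request = build_subscribe_message("en", "Your Name", "you@example.com")
```

<p>The resulting message can be handed to any mail client or SMTP library.
The confirmation message that comes back still has to be answered by hand
or by a similar script.</p>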
441 <h2>Papers and Presentations</h2>
443 <li><a href="xemacs/#presentation">
444 About XEmacs UTF-2000</a>
445 <li><a href="#network">About mathematical analysis of Character Information</a>
448 <li><a href="papers/u2k-plan.ja/">
449 “Model and Implementation of a Next Generation Multilingual
450 Processing System”
451 </a> (in Japanese. October 1999)
452 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
453 “Non-system characters in XML documents”, in:
454 <i>The Frontier of Asian Information Processing</i>
455 [Seminar Series of the National Documentation and
456 Information Centers in Humanities] No. 10, November 2000
457 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
458 “The UTF-2000 Project”, in:
460 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
461 Kanji and Information, No.2</a>, March 2001 (in Japanese)
462 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
“CHISE project &mdash; beyond the UTF-2000”,
464 <a href="http://www.m17n.org/m17n2001/">
465 m17n2001: the Fifth International Symposium on Multilingual
466 Information Processing and Open Source Software
468 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
469 “A Short Introduction to UTF-2000 Project”,
470 the First TEI Character Set Issues Working Group (October 2001,
471 University of California, Berkeley, USA).
472 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
473 “What is Digitisation?”, in:
475 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
476 Kanji and Information, No.3</a>, October 2001 (in Japanese).
477 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
478 “The meaning of 'beyond character codes'”, in:
480 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
481 Kanji and Information, No.3</a>, October 2001 (in Japanese).
482 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
483 “Some thoughts on the digitization of Kanji”,
484 <i>Information Technology and the Humanities</i>
485 [Seminar Series of the National Documentation and
486 Information Centers in Humanities] No. 11, November 2001.
487 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">KAMICHI, Koichi</a>,
488 “Building KAGE (Kanji-font Automatic Generating Engine):
The Next Generation of Kanji Processing beyond the Character Code Model”
490 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
491 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
492 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
493 “Software Review: CHISE Project,”
494 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
495 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
496 <!-- <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA, Tomohiko</a>,
497 <a href="papers/dc2002.pdf">
498 「ポスト文字コード時代の文書処理技術に関する展望」</a>、
500 (全国文献・情報センター人文社会科学学術セミナーシリーズ No.12),
502 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">Morioka, Tomohiko</a>,
503 <a href="http://ya.sakura.ne.jp/~moro/">Moro, Shigeki</a>.
504 “Moji-sosei ni motozuku moji-shori
505 (Character Processing based on Character Features).”
506 <cite><a href="http://www.ipsj.or.jp/members/SIGNotes/Jpn/17/2004/062/"
507 >IPSJ SIG Technical Report Vol. 2004, No. 58 (2004-CH-62)</a></cite>.
508 May, 2004. pp. 53-60. (in Japanese)</li>
513 <h2><a href="history">History</a></h2>
515 This project was assisted by <a
516 href="http://www.ipa.go.jp/NBP/13nendo/13mito/koubo13.html">IPA Exploratory
517 Software Project, 2001</a>.
522 <b>[<a href="../">Above</a>]</b>
524 <p><img SRC="images/dragon.jpg" height=146 width=198></center>
529 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies (DICCS)</a>,
530 <a href="http://www.zinbun.kyoto-u.ac.jp/">Institute for Research in the Humanities</a>,
531 <a href="http://www.kyoto-u.ac.jp/">Kyoto University</a>
534 <a href="http://www.m17n.org/">m17n.org (the Organization for Multilingualization)</a>
535 <a href="http://www.aist.go.jp/">(National Institute of Advanced Industrial Science and Technology)</a>
539 <a href="http://www.hanazono.ac.jp/">Hanazono University</a>
542 <a href="http://www.aist.go.jp/">National Institute of Advanced Industrial Science and Technology</a>
545 <a href="http://bioinfo.tmd.ac.jp/">Dept. of Bioinformatics</a>,
546 <a href="http://www.tmd.ac.jp/mri/mri.html">Medical Research Institute</a>,
547 <a href="http://www.tmd.ac.jp/">Tokyo Medical and Dental University</a>
552 <!-- hhmts start --> Last modified: Tue Oct 18 12:40:40 JST 2005 <!-- hhmts end -->.
553 <a href="http://www.aurora.dti.ne.jp/~zom/Counter/index.html">
555 src="http://mousai.as.wakwak.ne.jp/cgi-bin/counterp.cgi?projects_chise-en.log"
561 <!-- Keep this comment at the end of the file
565 time-stamp-line-limit:40