1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
4 <title>CHaracter Information Service Environment</title>
5 <link rel=stylesheet href="chise.css" type="text/css">
12 <b><a href="http://www.chise.org/">[chise.org]</a></b>
17 <b><a href="index.html.ja.utf-8">[Japanese page]</a></b>
22 <a href="http://www.chise.org/"><img
24 src="images/diccs-s.jpg" align="middle"></a>
26 <a href="http://cvs.m17n.org/chise/"><img
27 alt="m17n.org" src="images/tomura-s.png" align="middle"></a>
28 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/"><img
29 alt="kanji.zinbun.kyoto-u.ac.jp" src="images/diccs-s.jpg" align="middle"></a>
30 <a href="http://mousai.as.wakwak.ne.jp/projects/chise/"><img
31 alt="mousai.as.wakwak.ne.jp" src="images/egret-pond-s.jpg"
38 <h1>CHISE project</h1>
42 <!--<b><a href="index.html.ja.utf-8"><img
43 src="images/japanese-page.png">
47 <h2>About the CHISE Project</h2>
The CHISE (CHaracter Information Service Environment) project aims
to collect information about the characters in the scripts of the
world and to organize it into a knowledge base. A new processing environment
built on this knowledge base is currently under development.
57 <!-- <li>Koichi Kamichi has published -->
58 <!-- <a href="http://fonts.jp/chise_linkmap/">chise_linkmap -->
59 <!-- (a visualization system for CHISE character database)</a>, -->
60 <!-- <a href="http://fonts.jp/chise_swig_perl/">chise_swig_perl -->
61 <!-- (a libchise wrapper for perl 5)</a> and -->
62 <!-- <a href="http://fonts.jp/makettf/">makettf -->
63 <!-- (simple TTF binder)</a>, which were results of -->
64 <!-- <a href="News/20051013-15.html">CHISE Conference 2005 -->
65 <!-- and CodeFest Kyoto 2005</a>.</li> -->
66 <!-- <li><a href="News/20051013-15.html">CHISE Conference -->
67 <!-- 2005</a> has been held this October 13 (Thu), 14 (Fri) -->
68 <!-- at <a href="http://www.kcif.or.jp/en/">Kyoto International -->
69 <!-- Community House</a>.</li> -->
70 <!-- <li><a href="http://mousai.kanji.zinbun.kyoto-u.ac.jp/ids-find"> -->
71 <!-- CHISE-IDS Hanzi/Hanja/Kanji Searcher -->
72 <!-- </a>has been published.</li> -->
73 <!-- <!-- <li>2004-06-09 (Wed) -->
74 <!-- Tomohiko Morioka will make a presentation on CHISE Project in -->
75 <!-- <a href="http://kura.hanazono.ac.jp/kanji/20040609symposium.html" -->
76 <!-- >Symposium: <i>Frontier of Character Information Processing: -->
<!-- Past, Present and Future</i></a>.</li> -->
78 <!-- <li>2004-05-28 (Fri) -->
79 <!-- A presentation on CHISE Project was made in -->
80 <!-- <a href="http://www.sigch.soken.ac.jp/2004.05/">the 62nd meeting of -->
81 <!-- the IPSJ SIG Computers and the Humanities</a>.</li> -->
82 <!-- <li>2003-11-28 (Fri), 29 (Sat) -->
83 <!-- <a href="http://coe21.zinbun.kyoto-u.ac.jp/ws-type-2003">Glyph -->
84 <!-- and Typesetting Workshop</a> was held at -->
85 <!-- <a href="http://www.kcif.or.jp/jp/footer/05.html" -->
86 <!-- >Kyoto City International Foundation</a>. -->
88 <!-- <li>2003-10-31 (Fri) -->
89 <!-- Presentations on the CHISE project were made in -->
90 <!-- <a href="http://lc.linux.or.jp/lc2003/index.html">Linux Conference -->
The CHISE project comprises the following sub-projects.
103 <li>Development of a character processing architecture based on a
104 character knowledge base
105 <!--文字知識データベースに基づく文字処理アーキテクチャの開発-->
108 <li><a href="xemacs/index.html">XEmacs CHISE</a>
109 <li><a href="ruby/index.html">Ruby/CHISE</a>
110 <li><a href="perl/index.html">Perl/CHISE</a>
112 <li><a href="http://fonts.jp/chise_swig_perl/"
117 <li>Concord: development of a prototyping OOP database engine
118 <li><a href="topicmaps/index.html">A TopicMaps based approach to a
120 <!--TopicMapsによる文字知識データベース・システムの開発--></a></li>
121 <li><a href="char-data/">Database of features of characters
122 <!--文字に関するさまざまな知識のデータベース化--></a>
124 <li><a href="ids/index.html">Database of the component structure of
125 Chinese Characters<!--漢字構造情報データベース--></a></li>
127 <li><a href="http://chise.zinbun.kyoto-u.ac.jp/ids-find"
130 <li>Database about variants and related characters
132 <li><a href="http://fonts.jp/chise_linkmap/"
<li><a href="glyph/index.html">Integration and Composition of
136 Character Glyphs and Styles<!--グリフ・字形情報の統合と合
139 <li><a href="http://fonts.jp/makettf/">makettf</a>
<li><a href="visualization/index.html">Mathematical analysis and visualization
145 of character knowledge<!--文字知識情報の数理的解析と可視化--></a></li>
146 <li><a href="omega/index.html">Omega/CHISE: Typesetting System in cooperation
147 with character knowledge database
148 <!--文字データベースと連携した組版システム--></a></li>
149 <li>CHISE-core / CHISE-base: integrated package and installer
153 <h2>Development of a character processing architecture based on a
154 character knowledge base</h2>
155 <h3><a name="xemacs/">XEmacs UTF-2000</a></h3> <p>
It is now possible to load character
attributes from an external database on demand ("lazy loading"). On
32-bit Intel processor architectures, the size of the executable file
thus shrinks from the 30 MB required by the traditional build to
just about 15 MB. This version can be downloaded as <a
href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
XEmacs UTF-2000 0.19 (Koriyama)</a>. In addition, there is a UTF-2000
branch of the XEmacs tree at cvs.m17n.org in /cvs/root, which can be
accessed by anonymous CVS.</p>
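<p>The idea of loading character attributes on demand can be sketched
roughly as follows. This is only an illustration in Python, not the
XEmacs UTF-2000 implementation: the class name, the attribute names
and the in-memory stand-in for the external database are all
assumptions made for the example.</p>

```python
class LazyCharacterAttributes:
    """Fetch a character's attributes from an external store on first access."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: character -> dict of attributes
        self._cache = {}         # characters whose attributes are already loaded
        self.fetch_count = 0     # number of lookups that reached the external store

    def get(self, char, attribute):
        # Only touch the external store the first time a character is requested.
        if char not in self._cache:
            self._cache[char] = self._fetch(char)
            self.fetch_count += 1
        return self._cache[char].get(attribute)


# Hypothetical stand-in for the external character database.
EXTERNAL_DB = {
    "字": {"ucs": 0x5B57, "total-strokes": 6},
    "文": {"ucs": 0x6587, "total-strokes": 4},
}


def fetch_from_db(char):
    """Return the attribute table for one character (empty if unknown)."""
    return EXTERNAL_DB.get(char, {})


attrs = LazyCharacterAttributes(fetch_from_db)
print(attrs.get("字", "total-strokes"))  # first access loads the entry
print(attrs.get("字", "ucs"))            # second access is served from the cache
```

<p>Because nothing is loaded until it is requested, the program itself
does not need to carry the full attribute tables, which is the effect
on executable size described above.</p>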
166 <h2>A <a name="topicmaps">
167 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
In 2001, a prototype of a Topic Map engine was developed based
on <a href="http://www.zope.org/">Zope</a>. This proved less than
ideal for the purpose, so the focus for this year is to port the
engine to a relational database backend. Development currently
continues with PostgreSQL. It is planned to enable Topic Map editing
within XEmacs UTF-2000, but also to allow multiple clients in addition
180 <h2>Database of features of characters</h2>
182 <h3>Database of the component structure of Chinese Characters</h3>
Based on the Ideographic Description Sequence (IDS) syntax defined in
ISO/IEC 10646-1:2000 and Unicode, we are now developing a database
that expresses the structure of Chinese characters.
At the moment, we are using the characters in the Unicode tables as a
reference. The basic <em>CJK Unified Ideographs</em>, as well as
<em>Extension A</em> and <em>Extension B</em>, together more
than 70,000 characters, are currently covered.
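<p>To illustrate the syntax, the following Python sketch parses an IDS
string into a nested structure. It assumes only the twelve Ideographic
Description Characters U+2FF0 to U+2FFB, of which U+2FF2 and U+2FF3
take three operands and the others two. The function names are
hypothetical and the code is not part of the CHISE tools; it does not
read the IDS-UCS files themselves.</p>

```python
def idc_arity(ch):
    """Return the operand count if ch is an Ideographic Description Character, else 0."""
    cp = ord(ch)
    if 0x2FF0 <= cp <= 0x2FFB:
        # U+2FF2 and U+2FF3 combine three components; the others combine two.
        return 3 if cp in (0x2FF2, 0x2FF3) else 2
    return 0


def parse_ids(ids):
    """Parse a prefix-notation IDS string into a character or a nested tuple."""

    def parse(pos):
        if pos >= len(ids):
            raise ValueError("unexpected end of IDS")
        ch = ids[pos]
        arity = idc_arity(ch)
        if arity == 0:
            return ch, pos + 1            # a plain component character
        pos += 1
        operands = []
        for _ in range(arity):
            node, pos = parse(pos)        # each operand may itself be an IDS
            operands.append(node)
        return (ch,) + tuple(operands), pos

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError("trailing characters after IDS")
    return tree


print(parse_ids("⿰氵每"))        # left-right composition of two components
print(parse_ids("⿰氵⿰古月"))    # nested: the right-hand part is itself decomposed
```

<p>The operator always precedes its operands, so a single left-to-right
pass is enough to recover the structure, and nesting falls out of the
recursion.</p>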
195 <a href="images/ids-ext-b-1.png">
196 <img align="ids" src="images/ids-ext-b-1-s.png">
198 Table of the component structure database
203 The following tables are currently available via anonymous CVS from <a
204 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
205 href="http://cvs.m17n.org/viewcvs/?cvsroot=chise">/cvs/chise</a>
207 href="http://cvs.m17n.org/viewcvs/ids/?cvsroot=chise">ids:</a>
213 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
216 <dd>CJK Unified Ideographs (U+4E00 〜 U+9FA5) of ISO/IEC
220 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
223 <dd>CJK Unified Ideographs Extension A (U+3400 〜 U+4DB5, U+FA1F and
224 U+FA23) of ISO/IEC 10646-1:2000
227 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
230 <dd>CJK Compatibility Ideographs (U+F900 〜 U+FA2D, except U+FA1F
231 and U+FA23) of ISO/IEC 10646-1:2000
234 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
237 <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 〜
238 U-00021FFF) of ISO/IEC 10646-2:2001
241 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
244 <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 〜
245 U-00023FFF) of ISO/IEC 10646-2:2001
247 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
250 <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 〜
251 U-00025FFF) of ISO/IEC 10646-2:2001
253 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
256 <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 〜
257 U-00027FFF) of ISO/IEC 10646-2:2001
259 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
262 <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 〜
263 U-00029FFF) of ISO/IEC 10646-2:2001
265 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
268 <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 〜
269 U-0002A6D6) of ISO/IEC 10646-2:2001
271 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
272 IDS-UCS-Compat-Supplement.txt
274 <dd>CJK Compatibility Ideographs Supplement (U-0002F800 〜
275 U-0002FA1D) of ISO/IEC 10646-2:2001
277 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
280 <dd>Morohashi: Daikanwa Jiten, Volume 1
282 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
285 <dd>Morohashi: Daikanwa Jiten, Volume 2
287 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
290 <dd>Morohashi: Daikanwa Jiten, Volume 3
292 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
295 <dd>Morohashi: Daikanwa Jiten, Volume 4
297 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
300 <dd>Morohashi: Daikanwa Jiten, Volume 5
302 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
305 <dd>Morohashi: Daikanwa Jiten, Volume 6
307 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
310 <dd>Morohashi: Daikanwa Jiten, Volume 7
312 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
315 <dd>Morohashi: Daikanwa Jiten, Volume 8
317 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
320 <dd>Morohashi: Daikanwa Jiten, Volume 9
322 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
325 <dd>Morohashi: Daikanwa Jiten, Volume 10
327 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
330 <dd>Morohashi: Daikanwa Jiten, Volume 11
332 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
335 <dd>Morohashi: Daikanwa Jiten, Volume 12
337 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
340 <dd>Morohashi: Daikanwa Jiten, Additions
342 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
345 <dd>Morohashi: Daikanwa Jiten, Appendix
347 href="http://cvs.m17n.org/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
350 <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
351 Association (CBETA)</a>
356 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
357 (<a href="http://www.fonts.jp/">
358 Forum for development of on-the-fly generation of Kanji Fonts
360 <a href="http://www.fonts.jp/search.html">
361 Analytic tool for Kanji Fonts (in Japanese)
<h3><a name="glyph">Integration and Composition of Character Glyphs
and Styles</a></h3> <p> The character database also collects information about
character glyphs and styles. Combined with the other knowledge about a
character in the database, this makes it possible
to build a system that uses the <a href="#ids">component
structure information</a> to assemble the font for a character
from its components, depending on the contextual requirements. With
this system, mismatches caused by erroneous association
or insufficient contextual information are excluded, and it will be
possible to easily display and print character forms that have not been codified and for
which no fonts exist.
378 <a href="http://www.fonts.jp/">
379 Forum for development of on-the-fly generation of Kanji Fonts
<h3><a name="network">Mathematical analysis and visualization of
385 character knowledge</a></h3>
387 <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
389 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
390 Network of Words</a>”, <a href="http://arob.cc.oita-u.ac.jp/">
391 Artificial Life and Robotics 2002</a>
392 (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
393 Presentation material
395 <li>Model for the relation of Kanji characters that share a component
398 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
400 src="images/mage1-s.jpg"><br>Image 1</a>
402 <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
404 src="images/mage2-s.jpg"><br>Image 2</a>
409 <h2>CVS Repository</h2>
411 <a href="http://cvs.m17n.org/viewcvs/?cvsroot=chise">/cvs/chise</a>
415 <h2>Mailing List</h2>
Discussion about the CHISE Project takes place on the CHISE-{ja|en} mailing lists.
Anybody who would like to take part in the discussion and
development of the CHISE Project, or who has ideas or questions about the
implementation or wishes for new features, is welcome to join
the English list, the Japanese list, or both.
To become a member of a CHISE mailing list, send a message to the
428 <dd><a href="mailto:chise-ja-ctl@m17n.org">
429 chise-ja-ctl@m17n.org</a>
432 <dd><a href="mailto:chise-en-ctl@m17n.org">
433 chise-en-ctl@m17n.org</a>
437 <blockquote>subscribe Your Name</blockquote>
in the body of the message. You will then receive a confirmation
439 message with the line
442 confirm PASSWORD Your Name
443 </blockquote> You will have to reply to this message to become a member.
447 <h2>Papers and Presentations</h2>
449 <li><a href="xemacs/#presentation">
450 About XEmacs UTF-2000</a>
451 <li><a href="#network">About mathematical analysis of Character Information</a>
454 <li><a href="papers/u2k-plan.ja/">
455 “Model and Implementation of a Next Generation Multilingual
456 Processing System”
457 </a> (in Japanese. October 1999)
458 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
459 “Non-system characters in XML documents”, in:
460 <i>The Frontier of Asian Information Processing</i>
461 [Seminar Series of the National Documentation and
462 Information Centers in Humanities] No. 10, November 2000
463 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
464 “The UTF-2000 Project”, in:
466 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
467 Kanji and Information, No.2</a>, March 2001 (in Japanese)
468 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
“CHISE project &mdash; beyond the UTF-2000”,
470 <a href="http://www.m17n.org/m17n2001/">
471 m17n2001: the Fifth International Symposium on Multilingual
472 Information Processing and Open Source Software
474 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
475 “A Short Introduction to UTF-2000 Project”,
476 the First TEI Character Set Issues Working Group (October 2001,
477 University of California, Berkeley, USA).
478 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
479 “What is Digitisation?”, in:
481 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
482 Kanji and Information, No.3</a>, October 2001 (in Japanese).
483 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
484 “The meaning of 'beyond character codes'”, in:
486 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
487 Kanji and Information, No.3</a>, October 2001 (in Japanese).
488 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
489 “Some thoughts on the digitization of Kanji”,
490 <i>Information Technology and the Humanities</i>
491 [Seminar Series of the National Documentation and
492 Information Centers in Humanities] No. 11, November 2001.
493 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">KAMICHI, Koichi</a>,
494 “Building KAGE (Kanji-font Automatic Generating Engine):
The Next Generation of Kanji Processing beyond the Character Code Model”
496 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
497 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
498 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
499 “Software Review: CHISE Project,”
500 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
501 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
502 <!-- <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA, Tomohiko</a>,
503 <a href="papers/dc2002.pdf">
504 「ポスト文字コード時代の文書処理技術に関する展望」</a>、
506 (全国文献・情報センター人文社会科学学術セミナーシリーズ No.12),
508 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">Morioka, Tomohiko</a>,
509 <a href="http://ya.sakura.ne.jp/~moro/">Moro, Shigeki</a>.
510 “Moji-sosei ni motozuku moji-shori
511 (Character Processing based on Character Features).”
512 <cite><a href="http://www.ipsj.or.jp/members/SIGNotes/Jpn/17/2004/062/"
513 >IPSJ SIG Technical Report Vol. 2004, No. 58 (2004-CH-62)</a></cite>.
514 May, 2004. pp. 53-60. (in Japanese)</li>
519 <h2><a href="history">History</a></h2>
521 This project was assisted by <a
522 href="http://www.ipa.go.jp/NBP/13nendo/13mito/koubo13.html">IPA Exploratory
523 Software Project, 2001</a>.
528 <b>[<a href="../">Above</a>]</b>
530 <p><img SRC="images/dragon.jpg" height=146 width=198></center>
535 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies (DICCS)</a>,
536 <a href="http://www.zinbun.kyoto-u.ac.jp/">Institute for Research in the Humanities</a>,
537 <a href="http://www.kyoto-u.ac.jp/">Kyoto University</a>
540 <a href="http://www.m17n.org/">m17n.org (the Organization for Multilingualization)</a>
541 <a href="http://www.aist.go.jp/">(National Institute of Advanced Industrial Science and Technology)</a>
545 <a href="http://www.hanazono.ac.jp/">Hanazono University</a>
548 <a href="http://www.aist.go.jp/">National Institute of Advanced Industrial Science and Technology</a>
551 <a href="http://bioinfo.tmd.ac.jp/">Dept. of Bioinformatics</a>,
552 <a href="http://www.tmd.ac.jp/mri/mri.html">Medical Research Institute</a>,
553 <a href="http://www.tmd.ac.jp/">Tokyo Medical and Dental University</a>
559 Last modified: Tue Sep 21 19:50:46 JST 2010
561 <a href="http://www.aurora.dti.ne.jp/~zom/Counter/index.html">
563 src="http://mousai.as.wakwak.ne.jp/cgi-bin/counterp.cgi?projects_chise-en.log"