1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
4 <title>CHaracter Information Service Environment</title>
5 <link rel=stylesheet href="chise.css" type="text/css">
15 <b><a href="index.html.ja.iso-2022-jp">[Japanese page]</a></b>
20 <a href="http://cvs.m17n.org/chise/"><img
21 alt="m17n.org" src="images/tomura-s.png" align="middle"></a>
22 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/"><img
23 alt="kanji.zinbun.kyoto-u.ac.jp" src="images/diccs-s.jpg" align="middle"></a>
24 <a href="http://mousai.as.wakwak.ne.jp/projects/chise/"><img
25 alt="mousai.as.wakwak.ne.jp" src="images/egret-pond-s.jpg"
32 <h1>CHISE project</h1>
36 <!--<b><a href="index.html.ja.iso-2022-jp"><img
37 src="images/japanese-page.png">
41 <h2>About the CHISE Project</h2>
The CHISE (CHaracter Information Service Environment) project aims
to collect information about the characters in the scripts of the
world and to organize it into a knowledge base. A new processing
environment based on this architecture is currently under development.
Tomohiko Morioka will make a presentation on the CHISE Project at the
<a href="http://kura.hanazono.ac.jp/kanji/20040609symposium.html"
>Symposium: <i>Frontier of Character Information Processing:
Past, Present and Future</i></a>.</li>
A presentation on the CHISE Project will be made at the
58 <a href="http://www.sigch.soken.ac.jp/2004.05/">IPSJ
59 SIG Computers and the Humanities</a>.</li>
60 <li>2003-11-28 (Fri), 29 (Sat)
61 <a href="http://coe21.zinbun.kyoto-u.ac.jp/ws-type-2003">Glyph
62 and Typesetting Workshop</a> was held at
63 <a href="http://www.kcif.or.jp/jp/footer/05.html"
64 >Kyoto City International Foundation</a>.
66 <!-- <li>2003-10-31 (Fri) -->
67 <!-- Presentations on the CHISE project were made in -->
68 <!-- <a href="http://lc.linux.or.jp/lc2003/index.html">Linux Conference -->
The CHISE project consists of the following sub-projects.
81 <li>Development of a character processing architecture based on a
82 character knowledge base
83 <!--文字知識データベースに基づく文字処理アーキテクチャの開発-->
85 <li><a href="xemacs/index.html">XEmacs CHISE</a>
86 <li><a href="ruby/index.html">Ruby/CHISE</a>
87 <li><a href="perl/index.html">Perl/CHISE</a>
90 <li><a href="topicmaps/index.html">A TopicMaps based approach to a
92 <!--TopicMapsによる文字知識データベース・システムの開発--></a></li>
93 <li><a href="char-data/">Database of features of characters
94 <!--文字に関するさまざまな知識のデータベース化--></a>
96 <li><a href="ids/index.html">Database of the component structure of
97 Chinese Characters<!--漢字構造情報データベース--></a></li>
<li><a href="glyph/index.html">Integration and Composition of
99 Character Glyphs and Styles<!--グリフ・字形情報の統合と合成--></a></li>
<li><a href="visualization/index.html">Mathematical analysis and visualization
103 of character knowledge<!--文字知識情報の数理的解析と可視化--></a></li>
104 <li><a href="omega/index.html">Omega/CHISE: Typesetting System in cooperation
105 with character knowledge database
106 <!--文字データベースと連携した組版システム--></a></li>
110 <h2>Development of a character processing architecture based on a
111 character knowledge base</h2>
112 <h3><a name="xemacs/">XEmacs UTF-2000</a></h3> <p>
It is now possible to load character
attributes from an external database on demand ("lazy loading"). On
Intel 32-bit processor architectures, the size of the executable file
thus shrinks from the 30 MB required by the traditional build to
just about 15 MB. It can be downloaded as <a
href="http://www.kanji.zinbun.kyoto-u.ac.jp/projects/chise/dist/XEmacs/xemacs-utf-2000-0.19.tar.gz">
XEmacs UTF-2000 0.19 (Koriyama)</a>. In addition, there is a UTF-2000
branch of the XEmacs tree at cvs.m17n.org in /cvs/root, which can be
accessed by anonymous CVS.</p>
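<p>The lazy-loading idea described above can be illustrated with a
minimal Python sketch. It is not the XEmacs UTF-2000 implementation:
SQLite stands in for the external database, and the table layout, the
class name and the attribute names are all assumptions made for this
example.</p>

```python
import sqlite3


class LazyCharacterAttributes:
    """Load character attributes from an external database on first access."""

    def __init__(self, connection):
        self._connection = connection
        self._cache = {}          # code point -> dict of attributes
        self.database_reads = 0   # counts queries, for illustration only

    def get(self, code_point):
        # Query the database only the first time a character is requested.
        if code_point not in self._cache:
            rows = self._connection.execute(
                "SELECT name, value FROM attributes WHERE code_point = ?",
                (code_point,),
            ).fetchall()
            self.database_reads += 1
            self._cache[code_point] = dict(rows)
        return self._cache[code_point]


def build_demo_database():
    """Create a small in-memory stand-in for the external database."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE attributes (code_point INTEGER, name TEXT, value TEXT)"
    )
    # Hypothetical attribute names and sample rows, not CHISE data.
    connection.executemany(
        "INSERT INTO attributes VALUES (?, ?, ?)",
        [
            (0x4E00, "total-strokes", "1"),
            (0x4E00, "general-category", "Lo"),
            (0x6F22, "general-category", "Lo"),
        ],
    )
    return connection


attributes = LazyCharacterAttributes(build_demo_database())
first = attributes.get(0x4E00)
second = attributes.get(0x4E00)  # served from the cache, no second query
```

<p>Nothing is read at start-up; each character costs one query the
first time it is requested and none afterwards. Keeping the attribute
tables out of the program image in this way is the kind of saving the
paragraph above reports for the executable size.</p>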
123 <h2>A <a name="topicmaps">
124 <a href="http://www.topicmaps.org">TopicMaps</a> based approach to a
In 2001, a prototype of a Topic Map engine was developed based
on <a href="http://www.zope.org/">Zope</a>. This proved less than
ideal for this purpose, so the focus for this year is to port this
engine to a relational database backend. Development currently
continues with PostgreSQL. It is planned to enable Topic Map editing
within XEmacs UTF-2000, but also to allow multiple clients in addition
137 <h2>Database of features of characters</h2>
139 <h3>Database of the component structure of Chinese Characters</h3>
Based on the Ideographic Description Sequences (IDS) defined in
ISO/IEC 10646-1:2000 and Unicode, we are now developing a database
that expresses the structure of Chinese characters using this syntax.
At the moment, we are using the characters in the Unicode tables as a
reference. The basic <em>CJK Unified Ideographs</em>, as well as
<em>Extension A</em> and <em>Extension B</em>, together more
than 70,000 characters, are currently covered.
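<p>To make the syntax concrete, here is a minimal Python sketch of a
parser for such sequences. It assumes the twelve Ideographic Description
Characters U+2FF0 to U+2FFB, of which U+2FF2 and U+2FF3 take three
operands and the rest take two. The function names are hypothetical,
and the sample decomposition is an illustration rather than an entry
copied from the CHISE tables.</p>

```python
# Ideographic Description Characters U+2FF0..U+2FFB (assumed operator set).
OPERATORS = {chr(code_point) for code_point in range(0x2FF0, 0x2FFC)}
# These two operators take three operands; all the others take two.
TERNARY_OPERATORS = {"\u2FF2", "\u2FF3"}


def parse_ids(sequence):
    """Parse an IDS string into a nested (operator, operand, ...) tuple."""
    tree, rest = _parse(sequence)
    if rest:
        raise ValueError("unexpected trailing characters: %r" % rest)
    return tree


def _parse(sequence):
    if not sequence:
        raise ValueError("incomplete sequence")
    head, rest = sequence[0], sequence[1:]
    if head not in OPERATORS:
        return head, rest            # a plain component character
    arity = 3 if head in TERNARY_OPERATORS else 2
    operands = []
    for _ in range(arity):
        operand, rest = _parse(rest)
        operands.append(operand)
    return (head, *operands), rest


def components(tree):
    """Return the leaf components of a parsed tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    leaves = []
    for operand in tree[1:]:
        leaves.extend(components(operand))
    return leaves


# Illustrative decomposition: left-right of 言 and (top-bottom of 五 and 口).
tree = parse_ids("⿰言⿱五口")
leaves = components(tree)
```

<p>Because every operator has a fixed number of operands, the prefix
notation needs no brackets, and a recursive descent over the string is
enough to recover the component tree.</p>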
152 <a href="images/ids-ext-b-1.png">
153 <img align="ids" src="images/ids-ext-b-1-s.png">
155 Table of the component structure database
160 The following tables are currently available via anonymous CVS from <a
161 href="http://cvs.m17n.org/">cvs.m17n.org</a> at <a
162 href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
164 href="http://cvs.m17n.org/cgi-bin/viewcvs/ids/?cvsroot=chise">ids:</a>
170 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Basic.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
173 <dd>CJK Unified Ideographs (U+4E00 〜 U+9FA5) of ISO/IEC
177 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-A.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
180 <dd>CJK Unified Ideographs Extension A (U+3400 〜 U+4DB5, U+FA1F and
181 U+FA23) of ISO/IEC 10646-1:2000
184 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
187 <dd>CJK Compatibility Ideographs (U+F900 〜 U+FA2D, except U+FA1F
188 and U+FA23) of ISO/IEC 10646-1:2000
191 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-1.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
194 <dd>CJK Unified Ideographs Extension B [part 1] (U-00020000 〜
195 U-00021FFF) of ISO/IEC 10646-2:2001
198 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-2.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
201 <dd>CJK Unified Ideographs Extension B [part 2] (U-00022000 〜
202 U-00023FFF) of ISO/IEC 10646-2:2001
204 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-3.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
207 <dd>CJK Unified Ideographs Extension B [part 3] (U-00024000 〜
208 U-00025FFF) of ISO/IEC 10646-2:2001
210 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-4.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
213 <dd>CJK Unified Ideographs Extension B [part 4] (U-00026000 〜
214 U-00027FFF) of ISO/IEC 10646-2:2001
216 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-5.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
219 <dd>CJK Unified Ideographs Extension B [part 5] (U-00028000 〜
220 U-00029FFF) of ISO/IEC 10646-2:2001
222 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Ext-B-6.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
225 <dd>CJK Unified Ideographs Extension B [part 6] (U-0002A000 〜
226 U-0002A6D6) of ISO/IEC 10646-2:2001
228 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-UCS-Compat-Supplement.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
229 IDS-UCS-Compat-Supplement.txt
231 <dd>CJK Compatibility Ideographs Supplement (U-0002F800 〜
232 U-0002FA1D) of ISO/IEC 10646-2:2001
234 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-01.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
237 <dd>Morohashi: Daikanwa Jiten, Volume 1
239 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-02.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
242 <dd>Morohashi: Daikanwa Jiten, Volume 2
244 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-03.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
247 <dd>Morohashi: Daikanwa Jiten, Volume 3
249 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-04.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
252 <dd>Morohashi: Daikanwa Jiten, Volume 4
254 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-05.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
257 <dd>Morohashi: Daikanwa Jiten, Volume 5
259 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-06.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
262 <dd>Morohashi: Daikanwa Jiten, Volume 6
264 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-07.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
267 <dd>Morohashi: Daikanwa Jiten, Volume 7
269 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-08.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
272 <dd>Morohashi: Daikanwa Jiten, Volume 8
274 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-09.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
277 <dd>Morohashi: Daikanwa Jiten, Volume 9
279 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-10.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
282 <dd>Morohashi: Daikanwa Jiten, Volume 10
284 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-11.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
287 <dd>Morohashi: Daikanwa Jiten, Volume 11
289 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-12.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
292 <dd>Morohashi: Daikanwa Jiten, Volume 12
294 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-dx.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
297 <dd>Morohashi: Daikanwa Jiten, Additions
299 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-Daikanwa-ho.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
302 <dd>Morohashi: Daikanwa Jiten, Appendix
304 href="http://cvs.m17n.org/cgi-bin/viewcvs/*checkout*/ids/IDS-CBETA.txt?rev=HEAD&cvsroot=chise&content-type=text/plain">
307 <dd>Characters encountered by the <a href="http://www.cbeta.org/">Chinese Buddhist Electronic Text
308 Association (CBETA)</a>
313 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">Koichi KAMICHI</a>
314 (<a href="http://www.fonts.jp/">
315 Forum for development of on-the-fly generation of Kanji Fonts
317 <a href="http://www.fonts.jp/search.html">
318 Analytic tool for Kanji Fonts (in Japanese)
<h3><a name="glyph">Integration and Composition of Character Glyphs
and Styles</a></h3> <p> The character database also collects
information about character glyphs and styles. Combined with the other
knowledge about a character in the database, this makes it possible to
build a system that uses the <a href="#ids">component
structure information</a> to assemble the glyph for a character
from its components, depending on the contextual requirements. With
this system, mismatches caused by erroneous association
or insufficient contextual information are avoided, and it will be
easy to display and print character forms that have not been codified and for
which no fonts exist.
335 <a href="http://www.fonts.jp/">
336 Forum for development of on-the-fly generation of Kanji Fonts
<h3><a name="network">Mathematical analysis and visualization of
342 character knowledge</a></h3>
344 <li>Yoshi Fujiwara, Yasuhiro Suzuki, Tomohiko
346 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/nw.ps">
347 Network of Words</a>”, <a href="http://arob.cc.oita-u.ac.jp/">
348 Artificial Life and Robotics 2002</a>
349 (<a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/index.html">
350 Presentation material
352 <li>Model for the relation of Kanji characters that share a component
355 href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage1.jpg">
357 src="images/mage1-s.jpg"><br>Image 1</a>
359 <a href="http://www2.crl.go.jp/jt/a134/yoshi/pc/kanji/mage2.jpg">
361 src="images/mage2-s.jpg"><br>Image 2</a>
366 <h2>CVS Repository</h2>
368 <a href="http://cvs.m17n.org/cgi-bin/viewcvs/?cvsroot=chise">/cvs/chise</a>
372 <h2>Mailing List</h2>
Discussion about the CHISE Project takes place on the CHISE-{ja|en} mailing lists.
Anybody who would like to take part in the discussion and
development of the CHISE Project, or who has ideas or questions about
the implementation or wishes for new features, is welcome to join
the English list, the Japanese list, or both.
To become a member of a CHISE mailing list, send a message to the
385 <dd><a href="mailto:chise-ja-ctl@m17n.org">
386 chise-ja-ctl@m17n.org</a>
389 <dd><a href="mailto:chise-en-ctl@m17n.org">
390 chise-en-ctl@m17n.org</a>
394 <blockquote>subscribe Your Name</blockquote>
in the body of the message. You will then receive a confirmation
396 message with the line
399 confirm PASSWORD Your Name
400 </blockquote> You will have to reply to this message to become a member.
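<p>As a small illustration of the first step, the subscription request
could be composed with Python's standard <code>email</code> package. The
sender address and the helper name are placeholders for this sketch, and
actually sending the message (for example through an SMTP server) is
left out.</p>

```python
from email.message import EmailMessage

# Control addresses of the two lists, as given above.
CONTROL_ADDRESSES = {
    "ja": "chise-ja-ctl@m17n.org",
    "en": "chise-en-ctl@m17n.org",
}


def subscription_message(language, full_name, sender):
    """Build the request whose body is the line 'subscribe Your Name'."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = CONTROL_ADDRESSES[language]
    message.set_content("subscribe %s\n" % full_name)
    return message


# Placeholder name and sender address, for illustration only.
request = subscription_message("en", "Your Name", "you@example.org")
```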
404 <h2>Papers and Presentations</h2>
406 <li><a href="xemacs/#presentation">
407 About XEmacs UTF-2000</a>
408 <li><a href="#network">About mathematical analysis of Character Information</a>
411 <li><a href="papers/u2k-plan.ja/">
412 “Model and Implementation of a Next Generation Multilingual
413 Processing System”
414 </a> (in Japanese. October 1999)
415 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
416 “Non-system characters in XML documents”, in:
417 <i>The Frontier of Asian Information Processing</i>
418 [Seminar Series of the National Documentation and
419 Information Centers in Humanities] No. 10, November 2000
420 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
421 “The UTF-2000 Project”, in:
423 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-2.pdf">
424 Kanji and Information, No.2</a>, March 2001 (in Japanese)
425 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
“CHISE project &mdash; beyond the UTF-2000”,
427 <a href="http://www.m17n.org/m17n2001/">
428 m17n2001: the Fifth International Symposium on Multilingual
429 Information Processing and Open Source Software
431 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA Tomohiko</a>,
432 “A Short Introduction to UTF-2000 Project”,
433 the First TEI Character Set Issues Working Group (October 2001,
434 University of California, Berkeley, USA).
435 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
436 “What is Digitisation?”, in:
438 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
439 Kanji and Information, No.3</a>, October 2001 (in Japanese).
440 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
441 “The meaning of 'beyond character codes'”, in:
443 href="http://www.kanji.zinbun.kyoto-u.ac.jp/publications/kanji-and-info-3.pdf">
444 Kanji and Information, No.3</a>, October 2001 (in Japanese).
445 <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~wittern/">WITTERN, Christian</a>,
446 “Some thoughts on the digitization of Kanji”,
447 <i>Information Technology and the Humanities</i>
448 [Seminar Series of the National Documentation and
449 Information Centers in Humanities] No. 11, November 2001.
450 <li><a href="http://web.sfc.keio.ac.jp/~kamichi/">KAMICHI, Koichi</a>,
451 “Building KAGE (Kanji-font Automatic Generating Engine):
The Next Generation of Kanji Processing beyond the Character Code Model”
453 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
454 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
455 <li><a href="http://www.ya.sakura.ne.jp/~moro/">MORO, Shigeki</a>,
456 “Software Review: CHISE Project,”
457 in <a href="http://www.jaet.gr.jp/jj/3.html"><i>Journal of Japan Association for
458 East Asian Text Processing (JAET)</i> No. 3</a>, October 2002 (in Japanese).
459 <!-- <li><a href="http://www.kanji.zinbun.kyoto-u.ac.jp/~tomo/">MORIOKA, Tomohiko</a>,
460 <a href="papers/dc2002.pdf">
461 「ポスト文字コード時代の文書処理技術に関する展望」</a>、
463 (全国文献・情報センター人文社会科学学術セミナーシリーズ No.12),
468 <h2><a href="history">History</a></h2>
470 This project was assisted by <a
471 href="http://www.ipa.go.jp/NBP/13nendo/13mito/koubo13.html">IPA Exploratory
472 Software Project, 2001</a>.
477 <b>[<a href="../">Above</a>]</b>
479 <p><img SRC="images/dragon.jpg" height=146 width=198></center>
484 <a href="http://www.kanji.zinbun.kyoto-u.ac.jp/">Documentation and Information Center for Chinese Studies (DICCS)</a>,
485 <a href="http://www.zinbun.kyoto-u.ac.jp/">Institute for Research in the Humanities</a>,
486 <a href="http://www.kyoto-u.ac.jp/">Kyoto University</a>
489 <a href="http://www.m17n.org/">m17n.org (the Organization for Multilingualization)</a>
490 <a href="http://www.aist.go.jp/">(National Institute of Advanced Industrial Science and Technology)</a>
494 <a href="http://www.hanazono.ac.jp/">Hanazono University</a>
497 <a href="http://www.aist.go.jp/">National Institute of Advanced Industrial Science and Technology</a>
500 <a href="http://bioinfo.tmd.ac.jp/">Dept. of Bioinformatics</a>,
501 <a href="http://www.tmd.ac.jp/mri/mri.html">Medical Research Institute</a>,
502 <a href="http://www.tmd.ac.jp/">Tokyo Medical and Dental University</a>
508 Last modified: Mon May 17 02:42:17 JST 2004
510 <a href="http://www.aurora.dti.ne.jp/~zom/Counter/index.html">
512 src="http://mousai.as.wakwak.ne.jp/cgi-bin/counterp.cgi?projects_chise-en.log"
518 <!-- Keep this comment at the end of the file
522 time-stamp-line-limit:40