Hi folks,
Per my action item from today's Character Repertoires
telecon, some useful links.
Cheers,
- Ira McDonald
High North Inc
------------------------------------------------------
The best introduction and in-depth paper on UTF-8
and Unicode, from Markus Kuhn (a Unicode guru),
last updated 19 November 2002 (a living document):
http://www.cl.cam.ac.uk/~mgk25/unicode.html
IBM's open source ICU (I18N Components for Unicode)
has a wonderful set of charset maps (they're in
Unicode _to_ Legacy layout, which is the opposite of
the ISO-8859 maps at the Unicode site, which are in Legacy
_to_ Unicode layout). The top-level URL:
http://www-124.ibm.com/cvs/icu/charset/data/
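To illustrate the difference between the two layouts, here is a
minimal Python sketch. It assumes the three-column text format used
by the ISO-8859 maps at the Unicode site (legacy byte, Unicode code
point, '#' comment). The sample rows are ISO-8859-15 entries typed in
by hand, and the function names are hypothetical, not part of ICU or
any Unicode tool.

```python
# Sample rows in the assumed "legacy <TAB> Unicode <TAB> # name" format.
SAMPLE = (
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A\n"
    "0xA4\t0x20AC\t# EURO SIGN\n"
    "0xA6\t0x0160\t# LATIN CAPITAL LETTER S WITH CARON\n"
)


def parse_legacy_to_unicode(text):
    """Parse Legacy _to_ Unicode rows into {legacy_byte: code_point}."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # row with no Unicode value (undefined byte)
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def invert(table):
    """Flip the table into Unicode _to_ Legacy layout."""
    return {code_point: byte for byte, code_point in table.items()}


legacy_to_unicode = parse_legacy_to_unicode(SAMPLE)
unicode_to_legacy = invert(legacy_to_unicode)
```

The simple inversion only works when the mapping is one-to-one; a
table with fallback or duplicate entries would need extra handling.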
For official ISO and IETF language, country, and
script codes visit Michael Everson's page (Michael
is the IETF language tag reviewer and a heavyweight in
ISO language standards and the Unicode Consortium):
http://www.evertype.com/internationalization.html
The complete up-to-date Unicode Character Database:
The Unicode Unihan Asian database:
http://www.unicode.org/Public/UNIDATA/Unihan.zip (5.0 MB)
http://www.unicode.org/Public/UNIDATA/Unihan.txt (25.1 MB)
The mapping tables maintained by the Unicode Consortium:
http://www.unicode.org/Public/MAPPINGS/
And augmenting the mapping tables available from
the Unicode site is the CSets page:
http://crl.nmsu.edu/~mleisher/csets.html