Hi Elliott,
Yes - GB18030 is a mapping to EVERY codepoint in Unicode (not just
the assigned ones, but all 1.1 million possible Unicode codepoints).
But it's a multi-byte, variable-length encoding: each codepoint takes
one, two, or four bytes in GB18030.
As Markus Scherer says, it is best thought of as a Chinese-market
UTF (Unicode Transformation Format), like UTF-8, UTF-16, and UTF-32.
I therefore agree with you that PWG CR should view GB18030 as a
valid 'charset' (which can be tagged) but NOT as a unique
'repertoire' (because it's a different encoding of Unicode).
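To make the "it's a UTF, not a subset" point concrete, here is a rough
sketch that assumes Python's built-in 'gb18030' codec. The names SAMPLES,
gb18030_length, and round_trips are made up for this illustration and do
not come from any PWG or IBM document. It only shows that characters of
very different kinds all encode, at different byte lengths, and decode
back unchanged.

```python
# Illustrative sketch only: uses Python's standard 'gb18030' codec.
# SAMPLES, gb18030_length and round_trips are hypothetical helper names.

SAMPLES = [
    "A",           # ASCII letter
    "\u4e2d",      # CJK ideograph in the basic GBK range
    "\u0080",      # BMP codepoint outside the GBK two-byte area
    "\U00010000",  # first supplementary-plane codepoint
]


def gb18030_length(ch: str) -> int:
    """Return how many bytes GB18030 uses to encode one character."""
    return len(ch.encode("gb18030"))


def round_trips(ch: str) -> bool:
    """Return True if encoding then decoding yields the same character."""
    return ch.encode("gb18030").decode("gb18030") == ch


if __name__ == "__main__":
    for ch in SAMPLES:
        print(f"U+{ord(ch):04X}: {gb18030_length(ch)} byte(s), "
              f"round-trips: {round_trips(ch)}")
```

The same strings could be encoded as UTF-8 or UTF-16 with different byte
sequences but the same repertoire, which is why tagging the charset is
enough.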
Cheers,
- Ira McDonald
High North Inc
-----Original Message-----
From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com]
Sent: Monday, March 03, 2003 11:32 AM
To: McDonald, Ira
Cc: 'cr@pwg.org'; owner-cr@pwg.org
Subject: Re: CR> FW: GB 18030 Information Required
Interesting.
If I read this correctly, then 18030 is a mapping to ALL of Unicode. This
would make it an encoding, but not a subset.
If that's right, then we would treat it as a kind of charset, but not as a
repertoire.
Your thoughts?
E.
------------------------------------------
Elliott Bradshaw
Director, Software Engineering
Oak Technology Imaging Group
781 638-7534
From: "McDonald, Ira" <imcdonald@sharplabs.com>
Sent by: owner-cr@pwg.org
Sent: 03/03/2003 11:42 AM
To: "'cr@pwg.org'" <cr@pwg.org>
cc:
Subject: CR> FW: GB 18030 Information Required
Hi folks,
Elliott - the first two white papers (links below) look highly
useful. Markus Scherer is a Unicode and charsets heavy at IBM.
Cheers,
- Ira McDonald
High North Inc
-----Original Message-----
From: Markus Scherer [mailto:markus.scherer@jtcsv.com]
Sent: Monday, March 03, 2003 10:26 AM
To: vinay.aggarwal@rebus.co.in; charsets
Subject: Re: GB 18030 Information Required
vinay.aggarwal@rebus.co.in wrote:
> Could you please let me know if the following support GB18030?
> - Any web based application
> - Browser (Internet Explorer/ Netscape) based application
Yes and no. Generally, web-based applications and browsers and related
protocols do support GB 18030 and Unicode and various other charsets.
Specifically, you need to read about
- charsets, e.g.,
http://oss.software.ibm.com/icu/docs/papers/codepages_and_unicode.html
- GB 18030, e.g., http://oss.software.ibm.com/icu/docs/papers/gb18030.html
- Unicode, e.g., http://www.unicode.org/standard/WhatIsUnicode.html
and about the particular applications (and versions of them) that you
intend to use.
markus
This archive was generated by hypermail 2b29 : Mon Mar 03 2003 - 16:39:26 EST