Tom,
Excellent summation of our current alternatives. I also agree with
you that we should take the time to resolve this in a fully informed
manner, and not just sweep it under the carpet.
One question, though. In alternatives #2-#5, you don't explicitly
say how one handles byte values in the 0-31 range. Care to add
some comments about that? That way we'd have a fully specified
value range to consider. (This may be a silly question, but as you
note, silly questions are allowed... ;-)
...jay
----- Begin Included Message -----
Date: Thu, 24 Jul 1997 10:50:27 PDT
To: lpyoung at lexmark.com
From: Tom Hastings <hastings at cp10.es.xerox.com>
Subject: Re: PMP> Localization conclusion - prtGeneralPrinterName
Cc: pmp at pwg.org
I disagree.
We are getting close to an agreement on removing the ambiguity
of the char set (while NOT attempting to solve the much harder
problem of localization that includes language and country).
We have five alternatives proposed:
1. Leave the document as it is and leave "ASCII" as ambiguous.
2. Leave the document as it is, but at least state that "ASCII"
means US-ASCII in 32-126 and that 128 to 255 SHALL NOT be used, and
add a proper reference.
3. Allow any graphic characters in 128 to 255, but 32-126 SHALL be US-ASCII
but provide no way for an application to determine which character set
128 to 255 is representing. (My Tuesday proposal).
4. Allow any graphic characters in 128 to 255, but 32-126 SHALL be US-ASCII
AND provide a new object to say which code set is being used in 128
to 255. (My Wednesday SYNTHESIS proposal).
5. Allow only UTF-8, which is US-ASCII in 32-126 and uses a multi-byte
character encoding scheme in 128 to 255 to represent the ISO 10646 coded
character set. (David Kellerman's proposal).
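As a rough illustration of what alternatives 2 through 5 mean at the byte
level, here is a minimal Python sketch. The function name conforms() and
the allow_controls flag are hypothetical and not part of any proposal. The
alternatives as listed do not say how 0-31 (or 127) are handled, so the
sketch rejects those values unless the caller opts in; that default is an
assumption. For alternatives 3 and 4 the sketch cannot check that bytes in
128 to 255 are graphic characters, because that depends on a character set
the string itself does not name.

```python
def conforms(data: bytes, alternative: int, allow_controls: bool = False) -> bool:
    """Return True if data fits the byte ranges of alternative 2, 3, 4, or 5."""
    if alternative not in (2, 3, 4, 5):
        raise ValueError("only alternatives 2-5 constrain byte values")

    # Assumption: 0-31 and 127 are unspecified in the list above, so they
    # are rejected unless the caller explicitly allows them.
    if not allow_controls and any(b < 32 or b == 127 for b in data):
        return False

    if alternative == 2:
        # US-ASCII only: 128 to 255 SHALL NOT be used.
        return all(b < 128 for b in data)

    if alternative in (3, 4):
        # Any byte in 128 to 255 is accepted. Alternative 4 differs only in
        # that a separate object names the character set in use.
        return True

    # Alternative 5: the whole string must be well-formed UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, the single byte 0xE9 (e-acute in ISO 8859-1) passes under
alternatives 3 and 4 but fails under 2 and 5, while the UTF-8 encoding of
the same character passes under 3, 4, and 5 but fails under 2.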
If we can't take the time to evaluate these proposals with pros and cons
TODAY and make a choice, we aren't doing our job as a technical committee.
I'll attempt a list of pros and cons of each. I believe that there are
still a lot of people who do not even understand the 5 alternatives
and how they impact current and future products. In order to make
an informed decision, we need to understand them.
Anyone who has a question about any of the alternatives, please do NOT
hesitate to ask. There are no silly questions about this. I've worked
in coded character sets for twenty years (including being chairman
of the US-ASCII committee and working on ISO 8859 and ISO 10646), but
most of us are still learning.
One reading of RFC 2130 does not make you a coded character set expert,
even though that RFC is a very well-written and technically sound document.
Witness Michael Kirkham's misunderstandings of UTF-8 vs ASCII after reading
RFC 2130 and David Kellerman's good response to him on what UTF-8 is.
Tom
At 08:15 07/24/97 PDT, lpyoung at lexmark.com wrote:
>Chris and I are bringing the localization discussion to
>conclusion. There have been some side proposals that have
>come up from time to time, I wanted to separate these out
>to see if we have consensus on these proposed changes. One
>of the side proposals was to change the syntax of the
>prtGeneralPrinterName from DisplayString to OCTET STRING.
>If we want to make this change, I would propose the size
>be (0 to 63). I have checked with our networking people
>and this size covers all operating systems that we are
>aware of.
>
>I know most of us are tired of reading about localization
>and answering questions about localization. The only
>answer I want back from this note is "I agree" or
>"I disagree". Please leave the subject line as stated so
>I can easily count the votes.
>Thanks,
>Lloyd
>------------------------------------------------------------
>Lloyd Young Lexmark International, Inc.
>Senior Program Manager Dept. C14L/Bldg. 035-3
>Strategic Alliances 740 New Circle Road NW
>internet: lpyoung at lexmark.com Lexington, KY 40550
>Phone: (606) 232-5150 Fax: (606) 232-6740
----- End Included Message -----