I should have indicated the urgency of this in the subject line yesterday.
Please respond by today, either way, by noon PDT.
Tom
>Return-Path: <pmp-owner at pwg.org>
>X-Sender: hastings at zazen
>Date: Mon, 21 Jul 1997 18:48:35 PDT
>To: pmp at pwg.org
>From: Tom Hastings <hastings at cp10.es.xerox.com>
>Subject: FWD: PMP> Revised proposal on definition of OCTET STRING to
> allow superset of ASCII
>Sender: pmp-owner at pwg.org
>
>I just talked with Chris and she would like to have the PMP indicate
>by e-mail by noon PDT, Tuesday, 7/22, whether to make the change that
>I propose or not (attached).
>
>Could you please check with your implementations of the Printer MIB
>to see if they restrict the READ-WRITE objects to US-ASCII, i.e.,
>do they lop off the 8th bit or not. Could you also check to see if
>the read-only objects could have additional characters. We already
>have evidence about the practice of the HP 5si (see below). It would
>help to see what other practice exists. In other words,
>my proposal is just to document what implementations are really doing
>(which is my understanding of the IETF process of going from proposed to
>draft).
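
As an illustration of the check requested above, the sketch below compares a value written to a read-write object with the value read back. It models only the comparison: `strip_eighth_bit` is a hypothetical stand-in for an agent that masks each octet to 7 bits, `preserves_eighth_bit` is likewise an invented helper name, and no SNMP library is involved.

```python
def strip_eighth_bit(octets: bytes) -> bytes:
    """Hypothetical agent behaviour: clear the high bit of every octet."""
    return bytes(octet & 0x7F for octet in octets)


def preserves_eighth_bit(written: bytes, read_back: bytes) -> bool:
    """True if the value read back is octet-for-octet what was written."""
    return written == read_back


# "Zo" followed by e-diaeresis, encoded as ISO Latin 1 (one octet >= 128).
written = "Zo\u00eb".encode("latin-1")
masked = strip_eighth_bit(written)

print(preserves_eighth_bit(written, written))  # agent that keeps 8-bit data
print(preserves_eighth_bit(written, masked))   # agent that lops off the 8th bit
```

A pure US-ASCII test value cannot distinguish the two behaviours, so the value written needs at least one octet in the range 128-255.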
>
>Also allowing the Printer MIB to use UTF-8 allows an implementor to follow
>the recommendation of the IETF IAB in RFC 2130 that UTF-8 be the default
>character set. Leaving the MIB as it is forbids a conforming implementation
>to follow the recommendation to use UTF-8 at all.
>
>Thanks,
>Tom
>
>>Return-Path: <pmp-owner at pwg.org>
>>X-Sender: hastings at zazen
>>Date: Mon, 21 Jul 1997 16:40:53 PDT
>>To: pmp at pwg.org
>>From: Tom Hastings <hastings at cp10.es.xerox.com>
>>Subject: PMP> Revised proposal on definition of OCTET STRING to allow
>> superset of ASCII
>>Sender: pmp-owner at pwg.org
>>
>>I have heard no objections to the main thrust of my suggestion
>>to allow additional characters in code positions 128-255
>>for objects of syntax OCTET STRING, as long as code positions
>>32-126 remained US-ASCII. The discussion has been about
>>prtChannelInformation (which I have removed from this proposal).
>>There have been no objections to changing the new object
>>prtGeneralPrinterName from DisplayString to OCTET STRING either.
>>
>>I assume that silence means acceptance on the main thrust
>>of the proposal????
>>
>>However, to be clear, I've simplified the proposal and removed
>>any mention of prtChannelInformation and am re-circulating it.
>>
>>I've talked with David Kellerman. As a result I have
>>modified my proposal to avoid mentioning prtChannelInformation (fixing
>>that description will be a separate issue). I have also changed
>>the proposal so that any object of type OCTET STRING SHALL use no control
>>codes, unless specifically specified in the DESCRIPTION (this should
>>cover prtChannelInformation, prtGeneralCurrentOperator, and
>>prtGeneralServicePerson which talk about LF).
>>
>>I have also talked with Bob Pennecost. The HP 5si allows 8-bit data to
>>be written into read-write OCTET STRING objects. We tried
>>prtGeneralCurrentOperator and it accepted 8-bit Windows characters
>>and a Windows SNMPC application correctly displayed them. Furthermore,
>>for the read-write prtInputMediaName object, the 5si will only accept
>>values that have been previously set by the 5si private MIB using 8-bit
>>characters.
>>
>>So we have significant implementation practice of RFC 1759 that is *not*
>>limiting OCTET STRING to US-ASCII (7-bits, code positions 32-126) as specified
>>on page 14, top paragraph. So we need to fix page 14 and add a REFERENCE
>>section.
>>
>>Briefly, the problems with the current Printer MIB draft are:
>>
>>1. There are many objects of type OCTET STRING that are restricted to ASCII.
>>But ASCII is not a clearly defined term and existing practice is in
>>conflict with the most likely of the interpretations. Existing practice
>>is to use US-ASCII (ANSI X3.4) in code positions 0-127 and some other
>>coded character set in code positions 128-255. In other words, current
>>practice is to use 8-bit coded character sets in which code positions
>>0 to 127 are US-ASCII. Examples of such sets are: ISO Latin 1, HP Roman 8,
>>UTF-8, JIS X0208-1990 Japanese two byte set in 128-255 with US-ASCII in
>>0-127, GB 2312-1980 Chinese two-byte set in 128-255 with US-ASCII in 0-127.
>>
>>2. One of the new Printer MIB v2 objects, 'prtGeneralPrinterName', has
>>been given a SYNTAX of 'DisplayString', which forces NVT ASCII only
>>(code positions 128 to 255 SHALL NOT be used), instead of 'OCTET STRING',
>>which would give the same capabilities for other sets with US-ASCII
>>as a subset as in 1 above.
>>
>>3. There isn't a proper Bibliography section to refer to other standards
>>that are needed in order to understand references to terms, such as "ASCII",
>>"NVT ASCII", "Unicode", "UTF-8", etc.
>>
>>Explanation of the problems with suggested solutions and text.
>>
>>1. There is a serious ambiguity in the 02 Printer MIB draft about the many
>>objects of syntax OCTET STRING that are indicated as not being localized.
>>Page 14 describes them:
>>
>>    Localization is only performed on those strings in the MIB that
>>    are explicitly marked as being localized. All other character
>>    strings are returned in ASCII.
>>
>>There is no reference to what is meant by "ASCII".
>>
>>The different interpretations of this include:
>>
>>a. ANSI X3.4, the ANSI standard, in positions 0 to 127; 128 to 255 SHALL NOT
>>be used.
>>
>>b. NVT ASCII (RFC 854) in positions 0 to 127; 128 to 255 SHALL NOT be used.
>>NVT ASCII includes the following controls for virtual terminals: NUL (0),
>>LF (10), CR (13), BEL (7), BS (8), HT (9), VT (11), FF (12).
>>
>>c. Some think that it is any coded character set in which ASCII is in the
>>left hand side, i.e., values 0 to 127 decimal, and any other one or two octet
>>coded character set is in values 128 to 255, such as ISO 8859-1 (ISO Latin-1),
>>the Windows default set, HP Roman8, any of the eleven ISO 8859-n sets, UTF-8,
>>JIS X0208, GB2312, etc.
>>
>>d. And some think it means any coded character set at all, including Unicode
>>and any national 7-bit set, so that ASCII doesn't even have to be in
>>positions 0 to 127.
>>
>>Suggested solution:
>>
>>1. I propose that we clarify the Printer MIB to be interpretation c.
>>I believe that that will also correspond to actual practice of implementing
>>RFC 1759. For example, any of the ISO 8859-n (Latin 1, etc.) meet this
>>criteria. Also HP's Roman-8 meets this criteria, as does the Windows
>>default 8-bit character set. For Asian markets, they may use either UTF-8
>>which is a transformation of ISO 10646 (Unicode) that meets this criteria
>>or they may use US ASCII in code points 0 to 127 and their national two byte
>>coded character sets in code points 128 to 255 according to the code structure
>>of ISO 2022 for 8 bit environments.
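
The criterion in interpretation c can be spot-checked against a few codecs. The sketch below is only an illustration, not part of the proposal: `meets_interpretation_c` is a hypothetical helper, the codec names are the ones Python's standard library happens to use, and testing one sample character is a quick probe rather than a proof about the whole character set.

```python
# Printable US-ASCII, code positions 32-126.
ASCII_TEXT = "".join(chr(c) for c in range(32, 127))


def meets_interpretation_c(codec: str, sample: str = "\u00e9") -> bool:
    """True if `codec` leaves US-ASCII unchanged and encodes `sample`
    (e-acute by default) using only octets in the range 128-255."""
    try:
        ascii_unchanged = ASCII_TEXT.encode(codec) == ASCII_TEXT.encode("ascii")
        high_half_only = all(octet >= 128 for octet in sample.encode(codec))
    except (LookupError, UnicodeError):
        return False  # unknown codec, or it cannot encode the text
    return ascii_unchanged and high_half_only


# ISO Latin 1, Windows 8-bit, UTF-8, 16-bit Unicode, and an EBCDIC code page.
for name in ("latin-1", "cp1252", "utf-8", "utf-16-be", "cp037"):
    print(name, meets_interpretation_c(name))
```

The first three codecs pass and the last two fail, which matches the split between conforming and non-conforming sets described in the proposal.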
>>
>>So replace the second sentence of the paragraph on page 14:
>>
>>    All other character strings are returned in ASCII.
>>
>>with:
>>
>>    The agent SHALL return all other character strings as coded
>>    character sets in which code positions 0-127 (decimal) are
>>    US-ASCII [US-ASCII] and the remaining values, 128-255, may be any other
>>    coded character set, including multi-byte sets according to ISO 2022
>>    [ISO 2022] in 8-bit environments. Examples of
>>    coded character sets which meet this criteria are: US-ASCII,
>>    ISO 646:1991 IRV [ISO 646], ISO 8859-1 (Latin-1) [ISO 8859],
>>    any ISO 8859-n, HP Roman8, Windows Default 8-bit set, UTF-8 [UTF-8],
>>    US-ASCII plus JIS X0208-1990 Japanese [JIS X0208], GB2312-1980 Chinese
>>    [GB2312].
>>
>>    Examples of coded character sets which do not meet this criteria are:
>>    national 7-bit sets (except US ASCII), EBCDIC, and ISO 10646 (Unicode)
>>    [ISO 10646]. In order to represent Unicode characters, use UTF-8.
>>
>>    Control codes (code positions 0-31 and 127) SHALL NOT be used unless
>>    specifically specified in the DESCRIPTION of the object.
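
The control-code rule in the proposed text lends itself to a mechanical check. The following is a minimal sketch under stated assumptions: `violations` and its `allowed_controls` parameter are hypothetical names, and the set of controls an object's DESCRIPTION permits (for example LF) has to be supplied by the caller.

```python
# Control codes as defined in the proposed text: positions 0-31 and 127.
CONTROL_CODES = frozenset(range(0, 32)) | {127}


def violations(value: bytes, allowed_controls: frozenset = frozenset()) -> list:
    """Return (offset, octet) pairs for control codes that are not
    explicitly permitted for the object holding `value`."""
    return [
        (offset, octet)
        for offset, octet in enumerate(value)
        if octet in CONTROL_CODES and octet not in allowed_controls
    ]


print(violations(b"Tray 1"))                         # no control codes
print(violations(b"caf\xe9"))                        # octets 128-255 are allowed
print(violations(b"line1\nline2"))                   # LF not permitted by default
print(violations(b"line1\nline2", frozenset({10})))  # LF permitted for this object
```

Octets in 128-255 are never flagged, since the proposed text leaves that range open to any coded character set.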
>>
>>2. Change the syntax of the MIB object 'prtGeneralPrinterName'
>>   from 'DisplayString', which is restricted to US-ASCII, to OCTET STRING,
>>   so that other sets may be used in code positions 128 to 255 and so that
>>   the restricted set of controls will be specified.
>>
>>3. Add a proper Bibliography section so that the above references
>>can be made. I found a proper reference to US-ASCII in RFC 2044
>>(UTF-8) as:
>>
>>    [US-ASCII]  Coded Character Set--7-bit American Standard Code for
>>                Information Interchange, ANSI X3.4-1986.
>>
>>So it is ok to refer to ANSI standards from IETF standards.
>>
>>So I propose that the Bibliography section be:
>>
>>    [US-ASCII]  Coded Character Set - 7-bit American Standard Code for
>>                Information Interchange, ANSI X3.4-1986.
>>
>>    [ISO 646]   ISO 646:1991, "Information technology - ISO 7-bit coded
>>                character set for information interchange".
>>
>>    [ISO 8859]  ISO 8859-1:1987, "Information technology - 8-bit single
>>                byte coded graphic character sets -
>>                Part 1: Latin alphabet No. 1".
>>
>>    [ISO 2022]  ISO 2022:1994, "Information technology - Character code
>>                structure and extension techniques".
>>
>>    [ISO 10646] ISO 10646-1:1993, "Information technology - Universal
>>                Multiple-Octet Coded Character Set (UCS) - Part 1:
>>                Architecture and Basic Multilingual Plane".
>>
>>    [UTF-7]     Goldsmith, D., and M. Davis, "UTF-7", RFC 1642, Taligent,
>>                Inc., July 1994.
>>
>>    [UTF-8]     F. Yergeau, "UTF-8, a transformation format of Unicode
>>                and ISO 10646", RFC 2044, October 1996.
>>
>>    [NVT ASCII] J. Postel, J. Reynolds, "TELNET PROTOCOL SPECIFICATION",
>>                RFC 854, May 1983.
>>
>>    [JIS X0208] JIS X0208-1990, "Japanese two byte coded character set".
>>
>>    [GB2312]    GB 2312-1980, "Chinese People's Republic of China (PRC)
>>                mixed one byte and two byte coded character set".
>>
>>For reference:
>>I've extracted all objects of type OCTET STRING from the draft 02.
>>I've put "localized" in front of the ones whose DESCRIPTIONs say are
>>localized according to prtGeneralCurrentLocalization and "console
>>localization" in front of the ones whose DESCRIPTIONs say are localized by
>>prtConsoleLocalization:
>>
>>                      prtGeneralCurrentOperator     OCTET STRING,
>>                      prtGeneralServicePerson       OCTET STRING,
>>                      prtGeneralSerialNumber        OCTET STRING,
>>localized             prtCoverDescription           OCTET STRING,
>>                      prtCoverDescription           OCTET STRING,
>>                      prtLocalizationLanguage       OCTET STRING,
>>                      prtLocalizationCountry        OCTET STRING,
>>                      prtInputMediaName             OCTET STRING,
>>                      prtInputName                  OCTET STRING,
>>                      prtInputVendorName            OCTET STRING,
>>                      prtInputModel                 OCTET STRING,
>>                      prtInputVersion               OCTET STRING,
>>                      prtInputSerialNumber          OCTET STRING,
>>localized             prtInputDescription           OCTET STRING,
>>                      prtInputMediaType             OCTET STRING,
>>                      prtInputMediaColor            OCTET STRING,
>>                      prtOutputName                 OCTET STRING,
>>                      prtOutputVendorName           OCTET STRING,
>>                      prtOutputModel                OCTET STRING,
>>                      prtOutputVersion              OCTET STRING,
>>                      prtOutputSerialNumber         OCTET STRING,
>>localized             prtOutputDescription          OCTET STRING,
>>localized             prtMarkerSuppliesDescription  OCTET STRING,
>>                      prtMarkerColorantValue        OCTET STRING,
>>localized             prtMediaPathDescription       OCTET STRING,
>>                      prtChannelProtocolVersion     OCTET STRING,
>>                      prtInterpreterLangLevel       OCTET STRING,
>>                      prtInterpreterLangVersion     OCTET STRING,
>>localized             prtInterpreterDescription     OCTET STRING,
>>                      prtInterpreterVersion         OCTET STRING,
>>console localization  prtConsoleDisplayBufferText   OCTET STRING
>>console localization  prtConsoleDescription         OCTET STRING
>>localized             prtAlertDescription           OCTET STRING,
>>
>>We want to add to the above list:
>>
>>                      prtGeneralPrinterName         OCTET STRING
>>                      prtChannelInformation         OCTET STRING