To properly evaluate the impact of alternative 4 versus 5 on
applications, we first need to know which objects most applications
will simply pass through to their host platform, either (1) for display
or (2) for hand-off to some email, phone, or page utility,
and which ones the application will actually process, for example
by comparing them with a known list of values.
a. If the application just passes all OCTET STRING objects to its host platform
then, as long as the host platform accepts coded characters in the
same character set and encoding scheme as the application received
from the MIB, the application has little to do. The application just
passes the data through. I don't know whether any host platform will
accept UTF-8 directly. Most directly accept the sets we've listed
in alternative 4, including UNICODE (NT) but NOT UTF-8.
b. If the host platform has a different coded character set or encoding scheme,
then the application has to perform code conversion, usually using the
code conversion facilities offered by the host platform. POSIX has
a bunch of code conversion facilities. I don't know whether NT does.
I would hope that NT has a UTF-8 to Unicode conversion, though.
c. But if the application has to process the value, for example, compare for
a particular value, then the application has to make sure that the two
things being compared are in the same code set. With alternative 5,
the application would always store significant values in UTF-8 and would
avoid the conversion. With alternative 4, the application either converts
the data from the MIB to the application's preferred canonical char set
or has a catalog of significant values in different coded character sets
and chooses the right catalog depending on what char set the MIB data is in.
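To make case c concrete, here is a minimal Python sketch of the three
ways of comparing described above. It is only an illustration: the
function names, the sample values, and the use of Python codec names
such as "latin-1" to stand in for the alternative 4 char sets are my
assumptions, not anything from the MIB draft.

```python
# Illustrative sketch only: function names and sample values are
# hypothetical, not taken from the Printer MIB draft.


def matches_alt5(mib_octets, known_values):
    """Alternative 5: MIB data is always UTF-8, so the application keeps
    its significant values in UTF-8 and compares octets directly."""
    known_utf8 = {v.encode("utf-8") for v in known_values}
    return mib_octets in known_utf8


def matches_alt4_convert(mib_octets, mib_charset, known_values):
    """Alternative 4, first option: convert the MIB data to the
    application's canonical form (here a Python str), then compare."""
    return mib_octets.decode(mib_charset) in known_values


def build_catalogs(known_values, charsets):
    """Alternative 4, second option: build one catalog of significant
    values per coded character set the MIB might deliver."""
    return {cs: {v.encode(cs) for v in known_values} for cs in charsets}


def matches_alt4_catalog(mib_octets, mib_charset, catalogs):
    """Pick the catalog that matches the char set of the MIB data."""
    return mib_octets in catalogs[mib_charset]


if __name__ == "__main__":
    known = {"Letterhead", "Papier à en-tête"}
    as_utf8 = "Papier à en-tête".encode("utf-8")
    as_latin1 = "Papier à en-tête".encode("latin-1")

    print(matches_alt5(as_utf8, known))
    print(matches_alt4_convert(as_latin1, "latin-1", known))
    catalogs = build_catalogs(known, ["latin-1", "utf-8"])
    print(matches_alt4_catalog(as_latin1, "latin-1", catalogs))
```

Note that comparing raw octets only works when both sides are in the
same code set: the Latin-1 octets of a non-ASCII value will not match
the UTF-8 catalog, which is exactly the extra work alternative 4 puts
on the application.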
So the bottom line for us is to analyze the 25 objects and see which
ones an application might do more with than just pass along to the
host platform.
Could someone put a leading * on the (few) objects that they think an
application might actually process, and so would care what coded
character set the MIB data is in? I couldn't find any that I was sure
an application might want to process.
ISSUE:
An application might be comparing the XxxSerialNumber objects with
a list. But I wonder if we could agree to constrain serial numbers
to US-ASCII? Then they would not be subject to this discussion.
I think we have to leave XxxVersion as open to non-ASCII characters, but
I don't see applications comparing on them, or would they?
The PostScript version number is just a US-ASCII '2'.
I've extracted all objects of type OCTET STRING from the MIB draft 02.
I've put 'localized' in front of the ones whose DESCRIPTIONs say they are
localized according to prtGeneralCurrentLocalization and 'console
localization' in front of the ones whose DESCRIPTIONs say they are
localized by prtConsoleLocalization.
I've put RW for read-write objects and R for read-only objects.
NOTE that an implementation is NOT required to make any of the
RW objects writeable.
[The application is likely to want to parse the following two
looking for the keywords: 'phone: ', 'fax: ', and 'email: '.
Fortunately, these keywords are constrained to be US-ASCII, so the
application can pick off the values and pass them on without further
processing. So alternative 4 vs 5 is the same for these two as far
as processing goes - TH]
RW prtGeneralCurrentOperator OCTET STRING,
RW prtGeneralServicePerson OCTET STRING,
RW prtGeneralSerialNumber OCTET STRING,
R localized prtCoverDescription OCTET STRING,
The following two objects are proposed to be changed to DisplayString
since the ISO 639 and 3166 standards specify what the values shall be using
Latin letters only:
R prtLocalizationLanguage OCTET STRING,
R prtLocalizationCountry OCTET STRING,
RW prtInputMediaName OCTET STRING,
RW prtInputName OCTET STRING,
R prtInputVendorName OCTET STRING,
R prtInputModel OCTET STRING,
R prtInputVersion OCTET STRING,
R prtInputSerialNumber OCTET STRING,
R localized prtInputDescription OCTET STRING,
RW prtInputMediaType OCTET STRING,
RW prtInputMediaColor OCTET STRING,
RW prtOutputName OCTET STRING,
R prtOutputVendorName OCTET STRING,
R prtOutputModel OCTET STRING,
R prtOutputVersion OCTET STRING,
R prtOutputSerialNumber OCTET STRING,
R localized prtOutputDescription OCTET STRING,
R localized prtMarkerSuppliesDescription OCTET STRING,
R prtMarkerColorantValue OCTET STRING,
R localized prtMediaPathDescription OCTET STRING,
R prtChannelProtocolVersion OCTET STRING,
R prtInterpreterLangLevel OCTET STRING,
R prtInterpreterLangVersion OCTET STRING,
R localized prtInterpreterDescription OCTET STRING,
R prtInterpreterVersion OCTET STRING,
RW console localization prtConsoleDisplayBufferText OCTET STRING
R console localization prtConsoleDescription OCTET STRING
R localized prtAlertDescription OCTET STRING,
This proposal would add to the above list:
RW prtGeneralPrinterName OCTET STRING
NOTE: David Kellerman's revised prtChannelInformation changes the
syntax of prtChannelInformation from DisplayString to OCTET STRING, but
his revised DESCRIPTION states that the coded character set after the
"=" sign is determined by the value of the enum. Therefore,
prtChannelInformation is NOT affected by the code set specified
with alternative 4 or 5 and should NOT be in the list of objects
affected:
R prtChannelInformation OCTET STRING
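
For the keyword parsing mentioned in the note on
prtGeneralCurrentOperator and prtGeneralServicePerson, a rough Python
sketch follows. It works on the raw octets, since the keywords are
US-ASCII. Two things here are my assumptions and not from the draft:
that each value runs up to the next keyword or the end of the string,
and that the string is in an ASCII-compatible encoding (so this would
not work as written on 16-bit UNICODE data). The function name is
hypothetical.

```python
import re

# The three US-ASCII keywords named in the note above. The \b keeps
# "phone: " from matching inside a longer ASCII word.
_KEYWORD = re.compile(rb"\b(phone|fax|email): ")


def extract_contacts(octets):
    """Pick off the values that follow 'phone: ', 'fax: ', 'email: '.

    Assumption (not from the draft): a value runs until the next
    keyword or the end of the string. Values are returned as raw
    octets so they can be passed on without code conversion.
    """
    matches = list(_KEYWORD.finditer(octets))
    contacts = {}
    for i, m in enumerate(matches):
        # The value ends where the next keyword starts, if there is one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(octets)
        contacts[m.group(1).decode("ascii")] = octets[m.end():end].strip()
    return contacts


if __name__ == "__main__":
    sample = b"J. Smith phone: 555-0100 fax: 555-0101 email: ops@example.com"
    print(extract_contacts(sample))
```

Because only the keywords are matched and the values are passed on
untouched, the same code serves alternative 4 and alternative 5 as long
as the encoding leaves US-ASCII octets unchanged.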