IPP Mail Archive: IPP> MOD - Rationale for why we dropped "document-charset" attribute

Tom Hastings (hastings@cp10.es.xerox.com)
Fri, 14 Nov 1997 07:39:03 PST

A Xerox developer asked me why we dropped the "document-charset"
operation attribute from the IPP Model.

This is my reply, which I'm passing on to the rest of you.
My reply also indicates something for us to do when we register
the Printer MIB interpreter family enums as (MIME) media types.

Tom

Each (MIME) media type should have a charset parameter, if a charset needs
to be specified in order to interpret the data correctly. For example:

text/plain; charset=utf-8

Maybe we should add some more description to that effect in
Section 4.1.9. The RFC that defines text/plain says that if the
charset parameter is omitted, then the charset SHALL be assumed to
be us-ascii.
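
As a minimal sketch of that rule (not part of the IPP Model itself), the
following Python uses the standard library's email.message.Message to split
a media type string into its type and charset parameter, and falls back to
us-ascii only for text/plain. The function name effective_charset and the
constant are hypothetical names chosen for illustration.

```python
from email.message import Message

# Default stated for text/plain when the charset parameter is omitted.
DEFAULT_TEXT_PLAIN_CHARSET = "us-ascii"


def effective_charset(media_type):
    """Return (type, charset) for a media type string.

    charset is None when the media type carries no charset parameter
    and no default is known for it (hypothetical helper).
    """
    msg = Message()
    msg["Content-Type"] = media_type
    base = msg.get_content_type()        # lowercased type/subtype
    charset = msg.get_param("charset")   # None if the parameter is absent
    if charset is None and base == "text/plain":
        charset = DEFAULT_TEXT_PLAIN_CHARSET
    return base, (charset.lower() if charset else None)


print(effective_charset("text/plain; charset=utf-8"))
print(effective_charset("text/plain"))
print(effective_charset("application/vnd.hp-PCL"))
```

For application/vnd.hp-PCL the sketch reports no charset at all, which
matches the point made next: the charset comes from the escape sequence
embedded in the data, not from the media type.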

In talking with Bob Pentecost about PCL, he verified that PCL drivers
on Windows and WordPerfect do embed the charset escape sequence, so
that there was no problem with dropping the "document-charset" IPP
operation attribute as far as PCL was concerned. Bob also confirmed
that drivers are pretty good at embedding the escape sequence in
the data stream that indicates that the data stream is PCL. Consequently,
we added the parenthetical note to application/vnd.hp-PCL that
the charset escape sequence is embedded in the data.

So we didn't know of any document formats that needed a separate
charset attribute: in every case, either the data stream itself specifies
the charset or the media type allows a charset parameter.

When we register the document formats from the Printer MIB as (MIME)
media types, we need to make sure that the document formats that do
need a charset parameter indicate such and whether the charset parameter
is mandatory or optional. If it is optional, the registration also needs
to specify what charset is assumed when the charset parameter is omitted.
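
One way to picture what each registration would have to record is the
small Python sketch below. It is an illustration under assumptions, not
an actual registry: the CharsetRule class, the RULES table, and
resolve_charset are hypothetical names, and the entry
"application/x-example-text" is an invented format included only to show
a mandatory charset parameter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CharsetRule:
    allowed: bool                  # does the media type take a charset parameter?
    mandatory: bool = False        # must the client always supply it?
    default: Optional[str] = None  # charset assumed when it is omitted


# Illustrative entries only (hypothetical table, not real registrations).
RULES = {
    "text/plain": CharsetRule(allowed=True, default="us-ascii"),
    # PCL carries its charset as an escape sequence in the data stream.
    "application/vnd.hp-pcl": CharsetRule(allowed=False),
    # application/octet-stream has no charset parameter.
    "application/octet-stream": CharsetRule(allowed=False),
    # Invented format, shown only to illustrate a mandatory parameter.
    "application/x-example-text": CharsetRule(allowed=True, mandatory=True),
}


def resolve_charset(media_type, charset=None):
    """Return the charset to use, or raise ValueError if the rule is violated."""
    rule = RULES[media_type.lower()]
    if charset is not None:
        if not rule.allowed:
            raise ValueError(f"{media_type} does not allow a charset parameter")
        return charset.lower()
    if rule.mandatory:
        raise ValueError(f"{media_type} requires a charset parameter")
    return rule.default  # None when the charset lives in the data stream


print(resolve_charset("text/plain"))
print(resolve_charset("text/plain", "UTF-8"))
```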

The only problem may be the media type 'application/octet-stream', which
does not allow a charset parameter. So the client can't specify the
document charset when it submits a document as application/octet-stream
and the document is then auto-detected to be text/plain. Bob didn't think
that was a problem for them, because their auto-detect assumes the
document is PCL if it isn't PostScript.

Tom