After talking with Larry Masinter yesterday, I WITHDRAW my suggestion
that IPP's 'document-format' attribute be an extended form of a MIME
'media-type' (used in 'Content-Type' headers), with an added 'language'
parameter.
Larry argues that this fosters incoherence (in IETF standard protocols)
and forces an IPP Printer (i.e., server application) to sometimes PARSE
'document-format' in order to construct MIME headers for 'Content-Type'
and 'Content-Language'. Thus 'document-format' would NOT be opaque to
the IPP server application, which is not good.
Instead, I suggest we have two MANDATORY attributes for job operations
(and the Job Monitoring MIB):
1) 'document-format'
- value is 'media-type' (with 'charset' for 'text/*' types)
- maps one-to-one to MIME 'Content-Type' header
2) 'document-language'
- value is an RFC 1766 compliant language tag
- maps one-to-one to MIME 'Content-Language' header
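As a rough illustration of the one-to-one mapping (a sketch only, not
part of any IPP specification), the server can copy each attribute value
into its MIME header without parsing it. The function name
'mime_headers' and the dictionary representation of job attributes are
assumptions made for this example.

```python
def mime_headers(job_attributes):
    """Build MIME headers from IPP job attributes, treating values as opaque."""
    # Hypothetical mapping table: IPP attribute name -> MIME header name.
    mapping = {
        "document-format": "Content-Type",
        "document-language": "Content-Language",
    }
    headers = {}
    for attribute, header in mapping.items():
        value = job_attributes.get(attribute)
        if value is not None:
            # Copied verbatim: the server never looks inside the value.
            headers[header] = value
    return headers


job = {
    "document-format": "text/plain; charset=utf-8",
    "document-language": "en-US",
}
headers = mime_headers(job)
```

Because neither value is split or rewritten, 'document-format' stays
opaque to the server application.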
There remains one apparent problem with using MIME 'media-types' (see
RFC 2046) for IPP 'document-format': the 'charset' parameter may be
limited (see RFC 2046, section 4.1.2 'Charset Parameter', page 7) to
ONLY US-ASCII (7-bit) or ISO-8859-X (8-bit) character sets.
Support for UTF-8 (RFC 2044, the IANA registered character set type for
ISO 10646 folded into a multi-octet 8-bit superset of US-ASCII) is
critical for IPP documents. Support for ALL of the IANA registered
character set types is highly desirable (and coherent with the revised
ABNF for MIME parameter VALUES specified in RFC 2184).
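To make the 'charset' parameter concrete, here is a small sketch of how
a client (or a test tool) might read it from a 'media-type' value using
Python's standard 'email' package. The helper name 'charset_of' is an
assumption for this example; it only shows what the parameter looks
like and is not a suggestion that the IPP server should parse the value.

```python
from email.message import Message


def charset_of(document_format):
    """Return the lowercased 'charset' parameter of a media-type, or None."""
    msg = Message()
    # Reuse the standard MIME header parser rather than splitting by hand.
    msg["Content-Type"] = document_format
    return msg.get_content_charset()


utf8_charset = charset_of("text/plain; charset=UTF-8")
latin1_charset = charset_of('text/plain; charset="iso-8859-1"')
no_charset = charset_of("application/postscript")
```

A value such as 'text/plain; charset=UTF-8' carries the character set
in the same string as the media type, so no separate IPP attribute is
needed for it.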
Larry, can you comment on character sets for 'media-types' and hopefully
clarify this for us?
Cheers,
- Ira McDonald (outside consultant at Xerox)
High North Inc
906-494-2434
PS - A compelling reason for language tags on all text, stated in RFC
2184 on page 2, is to facilitate text 'reader' software for blind people
(knowing the 'charset' is NOT sufficient to 'read' text aloud). This is
an ethical consideration of the highest importance.
PPS - Note that a document which contains ONLY graphics and NO text does
not need (or benefit from) 'document-language', but that ANY document
which contains text (no matter what the 'media-type') benefits strongly
from 'document-language' (because the IPP server application need not
parse the document itself to discover embedded language tags to behave
properly).