IPP Mail Archive: Re: IPP> MOD - Separate 'document-format' and 'document-language'

Ira McDonald x10962 (imcdonal@eso.mc.xerox.com)
Tue, 30 Sep 1997 17:02:24 PDT

Hi Ned,

Thanks for the quick response. Actually I did re-read that section
of RFC 2046, and the short list (US-ASCII and ISO-8859-X) was
termed the 'Internet standard character sets'. It also says a
little later, 'this standard does NOT endorse the use of any
character set other than US-ASCII'.

I'm aware that UTF-8 is registered in the IANA character set
registry. What I couldn't find was an unambiguous statement
in RFC 2046 (or an updating RFC) that ANY IANA registered
character set MAY be specified in a 'charset' parameter of
a MIME 'media-type'. Can you point to such a statement, to
help us all out?

Cheers,
- Ira McDonald
------------------------- Ned's note --------------------------------
Return-Path: <Ned.Freed@innosoft.com>
Received: from zombi (zombi.eso.mc.xerox.com) by snorkel.eso.mc.xerox.com (4.1/XeroxClient-1.1)
id AA14548; Tue, 30 Sep 97 13:07:31 EDT
Received: from alpha.xerox.com by zombi (4.1/SMI-4.1)
id AA07025; Tue, 30 Sep 97 13:03:31 EDT
Received: from THOR.INNOSOFT.COM ([192.160.253.66]) by alpha.xerox.com with SMTP id <52232(4)>; Tue, 30 Sep 1997 10:03:26 PDT
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.1-10 #8694)
id <01IO7JHY9WM894GL1L@INNOSOFT.COM> for imcdonal@eso.mc.xerox.com; Tue,
30 Sep 1997 10:01:22 PDT
Date: Tue, 30 Sep 1997 09:35:50 PDT
From: Ned Freed <Ned.Freed@innosoft.com>
Subject: Re: IPP> MOD - Separate 'document-format' and 'document-language'
In-Reply-To: "Your message dated Tue, 30 Sep 1997 06:50:29 -0700 (PDT)"
<9709301350.AA14192@snorkel.eso.mc.xerox.com>
To: imcdonal@eso.mc.xerox.com
Cc: ipp@pwg.org
Message-Id: <01IO902DAVIO94GL1L@INNOSOFT.COM>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Status: R

> After talking with Larry Masinter yesterday, I WITHDRAW my suggestion
> that IPP's 'document-format' attribute be an extended form of a MIME
> 'media-type' (used in 'Content-Type' headers), with an added 'language'
> parameter.

> Larry argues that this fosters incoherence (in IETF standard protocols)
> and forces an IPP Printer (i.e., server application) to sometimes PARSE
> 'document-format', in order to construct MIME headers for 'Content-Type'
> and 'Content-Language' (thus 'document-format' would NOT be opaque to
> the IPP server application - this is not good).

> Instead, I suggest we have two MANDATORY attributes for job operations
> (and the Job Monitoring MIB):

> 1) 'document-format'
> - value is 'media-type' (with 'charset' for 'text/*' types)
> - maps one-to-one to MIME 'Content-Type' header

> 2) 'document-language'
> - value is an RFC 1766 compliant language tag
> - maps one-to-one to MIME 'Content-Language' header
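
The one-to-one mapping proposed above can be sketched in Python. This is
an illustration only, not part of the IPP drafts: the helper name
build_mime_headers and the sample attribute values are assumptions, and
the standard-library email package stands in for whatever an IPP Printer
would actually use.

```python
from email.message import EmailMessage


def build_mime_headers(document_format: str, document_language: str) -> EmailMessage:
    """Copy the two IPP attribute values into MIME headers unchanged.

    Hypothetical helper: neither value is parsed or split here, which is
    the property the proposal is after (both attributes stay opaque).
    """
    msg = EmailMessage()
    msg["Content-Type"] = document_format        # 'document-format' value
    msg["Content-Language"] = document_language  # 'document-language' value
    return msg


# Sample values chosen for illustration only.
headers = build_mime_headers("text/plain; charset=utf-8", "en-US")
print(headers.get_content_type())     # media type without parameters
print(headers.get_content_charset())  # 'charset' parameter, if present
print(headers["Content-Language"])    # RFC 1766 language tag
```

Because each attribute is copied into exactly one header, the server never
has to take 'document-format' apart to recover a language tag.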

> There remains one apparent problem with using MIME 'media-types' (see
> RFC 2046) for IPP 'document-format' - their possible limitation (see
> RFC 2046, section 4.1.2 'Charset Parameter', page 7) to the use of ONLY
> US-ASCII (7-bit) or ISO-8859-X (8-bit) character sets.

Such a restriction only exists in your imagination, I'm afraid. You need to
reread the section you cited. In particular, you should note that the list of
charsets it specifies is an *initial* list. Many other charsets can be, and
have been, registered. Over 200 of them, as a matter of fact.

Now, there are a fair number of problems with our current charset registration
procedures, but lack of registered charsets definitely isn't one of them.

> Support for UTF-8 (RFC 2044, the IANA registered character set type for
> ISO 10646 folded into a multi-octet 8-bit superset of US-ASCII) is
> critical for IPP documents.

UTF-8 is already registered and hence it is entirely legal to use it
in MIME. See:

ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets
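
As a minimal sketch of that point, assuming Python's standard email
package (not anything used by the participants in this thread), a
text/plain body can be labelled with the registered UTF-8 charset as
follows. The sample text is made up for the example.

```python
from email.message import EmailMessage

# Sample body containing characters outside US-ASCII (illustrative only).
sample_text = "Grüße aus Zürich"

msg = EmailMessage()
# set_content() records the charset as a parameter of Content-Type.
msg.set_content(sample_text, subtype="plain", charset="utf-8")

print(msg["Content-Type"])        # media type plus charset parameter
print(msg.get_content_charset())  # charset name as read back from the header
```

The charset travels as an ordinary 'charset' parameter on the media type,
the same slot RFC 2046 uses for US-ASCII and ISO-8859-X.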

> Support for ALL of the IANA registered character set
> types is highly desirable (and coherent with the revised ABNF for MIME
> parameter VALUES specified in RFC 2184).

This issue is being addressed in the new charset registration procedures. See
draft-freed-charset-reg-03.txt (soon to be -04) for specifics.

Ned
