IPP> MOD - Updated Model sections for internationalization

Tue Oct 14 11:30:32 EDT 1997

Hi Bob,                                        Tuesday (14 October 1997)

I read your note (below) with interest.  I sympathize with your desire
to avoid adding RFC 2184-style "charset/language" tags to the beginning
of all "text" and "name" strings in IPP requests/responses.

It would be REALLY helpful if all of the IPP working group members would
take the 20 minutes to read Harald's draft IETF Policy on Character Sets
and Languages closely and carefully.  Then we could deal with the realm
of solutions acceptable to the IESG, and not the simpler world we wish
we lived in.

For "text" strings, Harald's draft MANDATES both 'charset' and 'language'
negotiation and tagging permanently.  It SPECIFICALLY precludes a protocol
standard from specifying a default character set OR a default language
in the definition of a parameter or attribute of type "text".  So we
MUST have explicit printer default 'charset' and 'language' attributes.
And due to the requirement for negotiation, we also MUST have explicit
operation attributes for the 'charset' and 'language'.  And if we allow
either a printer object or a job object to have a mixture of 'charset'
or 'language' in their "text" or "name" attributes, then we MUST have
explicit tags in the syntax for "text" and "name" attributes.  If we
don't do this in the drafts of Model and Protocol which we submit to
the IESG, we'll get them back for repair (and we won't finish IPP/1.0
in calendar 1997, as a result).

What we COULD do is force ALL of the "text" and "name" attributes
in a request or response to use a SINGLE 'charset' and 'language',
and require that the 'charset' and 'language' of those attributes be
fixed at job creation time and be ones supported by the target printer
(so that job state reason messages, e.g., can be correctly generated by
the printer).
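
To make that concrete, here is a minimal Python sketch of the check a
printer might do at job creation.  The names (Printer, Job, create_job,
charsets_supported, languages_supported) are hypothetical and are not
taken from the Model or Protocol drafts; this only illustrates the
"one charset, one language, fixed at creation" rule.

```python
from dataclasses import dataclass


@dataclass
class Printer:
    # Hypothetical attributes: what this printer can generate text in.
    charsets_supported: set
    languages_supported: set


@dataclass(frozen=True)
class Job:
    # Frozen: the charset and language cannot change after creation.
    charset: str
    language: str


def create_job(printer, charset, language):
    """Fix a single charset and language for every "text" and "name"
    attribute of the job, rejecting values the printer cannot handle."""
    charset = charset.lower()
    language = language.lower()
    if charset not in printer.charsets_supported:
        raise ValueError("unsupported charset: " + charset)
    if language not in printer.languages_supported:
        raise ValueError("unsupported language: " + language)
    return Job(charset, language)
```

With this rule the printer never needs a per-attribute tag, because
every string in the job shares the pair stored on the Job.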

Cheers,
- Ira McDonald (outside consultant at Xerox)

PS - Bob, could you write your messages in a 'narrower' text window?
My mail reader truncates and wraps at 79 columns - lots of your lines
had about 85 columns, so I had to do some 'fixups' to include your note
below.  Thanks.
>--------------------------- Bob's note -------------------------------<
>Date: Mon, 13 Oct 1997 16:03:18 PDT
>From: Robert.Herriot at Eng.Sun.COM (Robert Herriot)
>To: ipp at pwg.org, hastings at cp10.es.xerox.com
>Subject: Re: IPP> MOD - Updated Model sections for internationalization
>
>I have a few comments about Tom's latest document. I have talked with Tom and
>we have general agreement about the suggestions below. I agreed to write them
>up and send them out to the mailing list.
>
>1) I am still bothered by specifying the syntax of text and name according
>   to RFC 2184 (charset "'" language "'"). I don't like the idea of
>   text and names having a "syntax".  Also, the restriction of text and
>   name values to those charsets that have ASCII as a subset bothers me
>   -- especially as new systems move toward UCS-2 as the native
>   representation.
>
>   I propose a different solution:
>
>    a) separate the charset/language information from the text/name value and
>       put them into separate values, like a multivalued attribute.
>
>    b) For the vast majority of cases where the charset and language of
>       the text/name attribute are the same as at the operation level,
>       the charset/language value is not needed. So the attribute stays
>       as a single-valued attribute, as it was in the July draft.
>
>    c) For those rare cases where the language or charset of an
>       attribute differs from the language and charset specified at the
>       operation level, treat the attribute as a pair of values where
>       the first value has a new type "charset/language" and the second
>       has a type of 'name' or 'text'.  The first value is in US-ASCII
>       and has the syntax of RFC 2184, e.g. "iso-latin-1'de'", and the
>       second value is the text/name in the language and charset
>       specified in the first value, with no restriction on the charset
>       for the text/name value.  If either the language or charset
>       field is empty, the value at the operation level is inherited.
>   
>       Note: if we had dictionaries, I would instead propose that
>       text/name values with an overridden charset or language could be
>       a dictionary containing the language and charset overrides along
>       with the text/name value. But we don't have dictionaries yet.
>
>       Note: alternatively, we could also add a special array type
>       consisting of n triplets type/value-length/value where there
>       would be three values: charset (type "charset"), language (type
>       "language") and a text/name value.
>
>    d) This change makes the syntax of all requests and most responses
>       be the same as it was before we added the two quotes syntax to
>       text and names.
>       
>
>2) Why isn't content-charset an optional attribute for the client to send?
>    
>    a) I would prefer that the default for a request and response be
>       UTF-8 if no content-charset parameter is present.
>    b) Then there is no need for a printer to have a
>       default-content-charset.
>    c) Make it a group 1 operation attribute and not an HTTP header.
>       Thus rename it 'attributes-charset'.
>
>3)  Why isn't content-natural-language an optional attribute for the
>    client to send?
>
>    a) I would prefer that the default for a request and response be the
>       server's default natural language.
>    b) Then there is still a need for a printer to have a
>       default-natural-language.
>    c) Make it a group 1 operation attribute and not an HTTP header.
>       Thus rename it 'attributes-natural-language'.
>
>4)  If my proposal is accepted, then jobs for most printers are unchanged
>    from the July draft by this proposal. They have no operation
>    attributes specifying charset or language and the text/name
>    attributes contain their value only and have no information about
>    charset and language. The default charset UTF-8 and the printer's
>    default language are used for all requests and responses. But the
>    mechanism is there for full support of charsets and languages if
>    needed.
>
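
The inheritance rules in Bob's points 1c, 2a, 3a and 4 can be sketched
in a few lines of Python.  This is only an illustration under stated
assumptions: the function names (parse_override, resolve_text) and the
list-of-values representation are hypothetical, and the tag format is
simply the "charset'language'" form quoted in the note.

```python
def parse_override(tag):
    """Split a "charset'language'" tag into (charset, language).

    An empty field is returned as None, meaning "inherit from the
    operation level".
    """
    parts = tag.split("'")
    # A well-formed tag has exactly two quotes and nothing after them.
    if len(parts) != 3 or parts[2] != "":
        raise ValueError("malformed charset/language tag: %r" % tag)
    return (parts[0] or None, parts[1] or None)


def resolve_text(values, op_charset=None, op_language=None,
                 printer_language="en"):
    """Return (charset, language, text) for one text/name attribute.

    values is [text] in the common case, or [tag, text] when the
    attribute overrides the operation-level charset or language.
    """
    # Operation-level values, falling back to the proposed defaults:
    # UTF-8 and the printer's default natural language.
    charset = op_charset or "utf-8"
    language = op_language or printer_language
    if len(values) == 1:
        return (charset, language, values[0])
    if len(values) == 2:
        over_charset, over_language = parse_override(values[0])
        return (over_charset or charset,
                over_language or language,
                values[1])
    raise ValueError("expected [text] or [tag, text]")
```

For example, resolve_text(["'de'", "Bericht"], op_charset="utf-8",
op_language="en") keeps the operation-level charset and overrides only
the language.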