Minutes:Character Repertoires working group meeting, January 21, 2003
Minutes of the March 31, 2003 meeting of the PWG Character Repertoires Committee
Elliott Bradshaw, 4/1/03
Attendees
Lee Farrell, Canon
Ira McDonald, High North
Alan Berkema, HP
Harry Lewis, IBM
Dennis Carney, IBM
Jerry Thrasher, Lexmark
Bill Wagner, NetSilicon
Elliott Bradshaw, Oak Technology
General Discussion
Our main agenda item was the review of the March 17 version of the Character
Repertoire Interoperability document.
It is clear that the overall intent and scope of the document are still
difficult to digest for the first-time reader. This may suggest we need
better explanations, or it may suggest the goals and scope are flawed.
There was consensus that it is useful to agree on a syntax for referring to
existing character repertoires, as discussed in 3.1; and that it is useful
to add this to SM and IPP.
There was more discussion and, perhaps, discomfort, about the material that
attempts to prescribe required advertisement and support under various
circumstances; i.e. the concept of a Basic Repertoire. For one
thing, the circumstances (e.g. "supporting similar repertoires") are
hard to lock down in a definitive way. For another, it is hard to pick a
list of Basic Repertoires and manage its growth in a productive way. For
example, we would like to expand the scheme to other countries without requiring
a major document revision cycle.
One possible approach is to move some of this material into a "Best
Practices" category, in which there would be no firm requirement but
general guidelines for how to proceed with as much interoperability as possible.
These discussion ideas are captured in the Issues, below.
Decisions
We made the following decisions regarding changes to the document:
Emphasize that support for a character does not imply WYSIWYG; any mapped
version will do.
State that support for a character means that the printer knows how to
render it, through any combination of local and downloaded-as-needed font
data.
Use Semantic Model naming conventions and syntax throughout. E.g.
"repertoires-supported" should be "RepertoireSupported"
(singular, not plural, for consistency with other SM Supported elements).
Remove "repertoires-ready", as the group didn't feel there was a
strong use case for this.
In section 3, separate out the SM-specific material into an SM mapping
section near the end; this will be short and descriptive (and, I suggest,
informative). The description of values should be independent of SM
(while freely borrowing SM syntax conventions where needed).
Put a similar IPP mapping section near the end of section 3; remove
appendices for SM and IPP mapping.
In section 3.1 clarify that we are defining a "textual
namespace" with our prefix conventions.
The mapping rules were approved as written, and there is no need for a
canonical form.
The conformance section should be a summary of previous material.
Make sure each item (e.g. requirement for euro) is addressed separately.
Amend the Unicode references to lock down the current version (3.2).
Issues
1. Should we add diagrams, use cases, or other explanatory material to the
   Introduction?
2. Should the rules for Basic Repertoires be normative, or a best
   practice? Should they be in a separate section or
   document?
   Specifically, when a printer advertises a repertoire, must it guarantee to
   render every character, or is it best effort?
3. It was suggested that in addition to RepertoireSupported, we allow
   Repertoire (or Repertoires?) to be submitted with a job. This would warn the printer
   that the job needed characters from those repertoires; if the printer
   had specific knowledge that it could not handle them, then it could fail the
   job. (But see issue #2.)
4. Currently the list of basic repertoires uses certain Unicode charts.
   Alternatively, should it simply include all Unicode charts? This would
   provide minimal interoperability for a wider range of circumstances, but it
   does extend the obligations of a printer, since it must support every
   character in each of the repertoires it advertises.
Stake in the ground: Best Practices
For discussion, here is a possible approach.
We create a new section called Best Practices, which I suppose is marked
Informative. The purpose of the Best Practices is to improve
interoperability, including with lightweight clients.
The syntax for repertoire names (i.e. the name space prefix notation)
remains normative. For normative purposes, support of a repertoire
means the printer can render most of its characters most of the time.
The matching rules are normative, as they define what "equal"
means for repertoire names.
Basic repertoires are:
All code charts of the form Unicode: chart
Unihan: GB 2312
Unihan: JIS X 0208
Unihan: KS X 1001:1992
Unihan: Big5
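For illustration only, here is a minimal sketch of how a client might split a prefixed repertoire name and test it against the basic list above. The function names are hypothetical, the split-on-first-colon rule and the exact-string comparison are assumptions (the document's actual matching rules are not reproduced in these minutes), and the sketch does not check that a "Unicode:" name is a real code chart.

```python
from typing import NamedTuple


class RepertoireName(NamedTuple):
    namespace: str  # prefix before the first colon, e.g. "Unihan"
    name: str       # remainder, e.g. "KS X 1001:1992"


def parse_repertoire_name(text: str) -> RepertoireName:
    """Split 'Prefix: name' on the FIRST colon only (assumed rule)."""
    namespace, sep, name = text.partition(":")
    if not sep or not namespace.strip() or not name.strip():
        raise ValueError(f"not a prefixed repertoire name: {text!r}")
    return RepertoireName(namespace.strip(), name.strip())


# The four non-Unicode basic repertoires listed in the minutes.
BASIC_UNIHAN = {
    parse_repertoire_name(s)
    for s in (
        "Unihan: GB 2312",
        "Unihan: JIS X 0208",
        "Unihan: KS X 1001:1992",
        "Unihan: Big5",
    )
}


def is_basic_repertoire(text: str) -> bool:
    """True for any 'Unicode: chart' name or one of the four Unihan entries.

    Assumption: exact comparison after trimming whitespace; chart names in
    the Unicode namespace are not validated.
    """
    parsed = parse_repertoire_name(text)
    return parsed.namespace == "Unicode" or parsed in BASIC_UNIHAN
```

Splitting on the first colon keeps a name such as "KS X 1001:1992" intact, which is the main reason the sketch uses `partition` rather than `split`.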
Best Practices include the following:
For each supported repertoire, be able to render all of its characters
regardless of which font is current. [Aggressive substitution and
non-WYSIWYG is fine.]
Always support and advertise Unicode: Basic Latin and Unicode: Latin-1
Supplement.
Always support the euro character.
Always support characters called for by a particular formatting
language, e.g. predefined character entities in XHTML.
[This is the fuzzy part] Support and advertise a basic repertoire
whenever supporting a similar one. E.g. when supporting IANA:
ISO-8859-7, also support Unicode: Greek.
When a client submits a job with the Repertoire element specified,
examine the values in the element. If any value names a repertoire that
the printer (a) recognizes and (b) knows it cannot support for most of
its characters, then fail the job as defined by the protocol in use.
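For discussion, the job-submission rule in the last item can be sketched as follows. This is a hedged illustration, not a proposed implementation: the `Support` enum, the `should_fail_job` function, and the dictionary of printer knowledge are all hypothetical, and names are compared as exact strings rather than by the document's matching rules.

```python
from enum import Enum
from typing import Iterable, Mapping


class Support(Enum):
    SUPPORTED = "supported"      # recognized, most characters renderable
    UNSUPPORTED = "unsupported"  # recognized, known NOT to be renderable
    UNKNOWN = "unknown"          # not recognized by this printer


def should_fail_job(
    job_repertoires: Iterable[str],
    printer_knowledge: Mapping[str, Support],
) -> bool:
    """Fail only when some value is both recognized and known unsupported.

    Unrecognized values never cause a failure, because the printer has no
    specific knowledge that it cannot handle them.
    """
    return any(
        printer_knowledge.get(name, Support.UNKNOWN) is Support.UNSUPPORTED
        for name in job_repertoires
    )


# Hypothetical printer state used only for this example.
printer = {
    "Unicode: Basic Latin": Support.SUPPORTED,
    "Unicode: Greek": Support.SUPPORTED,
    "Unihan: Big5": Support.UNSUPPORTED,
}

accepted = not should_fail_job(["Unicode: Greek", "Vendor: Private"], printer)
rejected = should_fail_job(["Unicode: Basic Latin", "Unihan: Big5"], printer)
```

Treating an unrecognized name as unknown rather than unsupported follows the wording above: the job fails only on specific knowledge, so a printer that has never heard of a repertoire still attempts the job.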