Great job, Roger! Your write-up addresses a lot of important issues.
I have the following comments.
Page 2:
Your description of the difference between the way information is
passed into a CGI script for GET and POST makes me wonder if
we shouldn't use POST for GetAttributes and GetJobs too.
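To make the GET/POST difference concrete, here is a minimal Python
sketch of how a CGI script typically sees its input: for GET the
parameters arrive in the QUERY_STRING environment variable, for POST
they arrive on stdin with CONTENT_LENGTH giving the byte count. The
function name, the parameter names (operation, printer) and the
assumption of a form-urlencoded body are mine, for illustration only.

```python
from urllib.parse import parse_qs


def read_request_parameters(environ, stdin):
    """Return the request parameters as a dict of lists.

    environ: mapping of CGI environment variables.
    stdin:   binary file-like object holding the request body.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "GET":
        # GET: everything is encoded into the URL's query string.
        raw = environ.get("QUERY_STRING", "")
    elif method == "POST":
        # POST: the body arrives on stdin; CONTENT_LENGTH says how much.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        raise ValueError("unsupported method: " + method)
    return parse_qs(raw)
```

With POST, an operation such as GetJobs would not be limited by
whatever length restrictions apply to URLs and environment variables.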
Page 6:
You got my idea pretty much right. The only difference I see is that
where you have 'content-type:text/plain' for the job attributes, I
suggested 'content-type:application/ippjob' to make it clear that
there is special stuff there. This difference also means that if one
entity body in a multipart/mixed has a content-type of
'application/ippjob', then a server (CGI script) can assume that the
other entity bodies are documents. We might even require that the
first entity body be application/ippjob to simplify processing.
Page 6:
At the bottom of the page you state that the two problems of boundary
and order of entity bodies could be show stoppers. You are also
concerned about adding document attributes.
I discuss below why boundaries aren't a problem. First, with regard
to ordering, it should be fairly easy to define that servers
(i.e. CGI scripts) shall assume that the first entity body of the
multipart/mixed is of type application/ippjob and that all subsequent
entities are documents in the order that a user submitted them.
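The ordering rule can be sketched with Python's standard email
package. This is only an illustration under my own assumptions: the
helper names and the sample attribute text are hypothetical, and
application/ippjob is the proposed (not registered) type.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart


def make_part(content_type, payload):
    """Build a single entity with the given content-type."""
    maintype, subtype = content_type.split("/")
    part = MIMENonMultipart(maintype, subtype)
    part.set_payload(payload)
    return part


def build_job(job_attributes, documents):
    """Client side: job attributes first, then documents in order."""
    job = MIMEMultipart("mixed")
    job.attach(make_part("application/ippjob", job_attributes))
    for content_type, data in documents:
        job.attach(make_part(content_type, data))
    return job


def split_job(message):
    """Server side: first entity is the job, the rest are documents."""
    if not message.is_multipart():
        raise ValueError("expected multipart/mixed")
    parts = message.get_payload()
    if not parts or parts[0].get_content_type() != "application/ippjob":
        raise ValueError("first entity must be application/ippjob")
    return parts[0], parts[1:]
```

The server never has to guess which entity carries the attributes; it
checks the first one and treats everything after it as a document.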
Furthermore, a document may be just a document or it may be
structured. If it is just a document, its content-type is text/plain,
application/PostScript or whatever. If it is structured, its
content-type is multipart/mixed with a first entity body whose
content-type is application/ippdocument and a second entity body
whose content-type is that of the document itself, such as text/plain
or application/PostScript.
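A rough Python sketch of the structured case, again using the
standard email package. The function names and the sample attribute
text are hypothetical; the point is only that a receiver can tell a
plain document from a structured one by looking at its content-type.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart


def make_part(content_type, payload):
    """Build a single entity with the given content-type."""
    maintype, subtype = content_type.split("/")
    part = MIMENonMultipart(maintype, subtype)
    part.set_payload(payload)
    return part


def wrap_with_attributes(document, attributes):
    """Nest a document inside multipart/mixed, attributes first."""
    wrapper = MIMEMultipart("mixed")
    wrapper.attach(make_part("application/ippdocument", attributes))
    wrapper.attach(document)
    return wrapper


def unwrap(entity):
    """Return (attributes or None, document entity)."""
    if entity.get_content_type() != "multipart/mixed":
        return None, entity          # just a document
    parts = entity.get_payload()
    if len(parts) != 2:
        raise ValueError("expected attributes plus one document")
    attrs, document = parts
    if attrs.get_content_type() != "application/ippdocument":
        raise ValueError("first entity must be application/ippdocument")
    return attrs.get_payload(), document
```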
Other arrangements are certainly possible, such as one where the
application/ippdocument is at the same level as application/ippjob
and immediately precedes the document that it modifies. In this case
the entities for a job of three PostScript documents with overrides
in the second document would be: application/ippjob,
application/PostScript, application/ippdocument,
application/PostScript, application/PostScript.
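For the flat arrangement, a receiver only needs to remember whether
the previous entity was an application/ippdocument. A minimal Python
sketch, working on the list of content-types alone (the function name
and the error handling are my own assumptions):

```python
def group_flat_entities(content_types):
    """Pair each document with a flag saying whether an
    application/ippdocument entity immediately preceded it."""
    if not content_types or content_types[0] != "application/ippjob":
        raise ValueError("first entity must be application/ippjob")
    documents = []
    has_overrides = False
    for content_type in content_types[1:]:
        if content_type == "application/ippdocument":
            if has_overrides:
                raise ValueError("two attribute entities in a row")
            has_overrides = True
        else:
            documents.append((content_type, has_overrides))
            has_overrides = False
    if has_overrides:
        raise ValueError("attributes with no document following")
    return documents
```

Applied to the five entities listed above, this yields three
documents, of which only the second is flagged as having overrides.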
You state that construction of boundaries requires prescanning a
document. Prescanning is one solution, but as the quote from RFC 1521
below shows, it is not necessarily the expected solution. And from
the samples I have seen, programs generate a unique boundary-id via
an algorithm that does not require scanning of the contents.
NOTE: Because encapsulation boundaries must not appear in the body
parts being encapsulated, a user agent must exercise care to
choose a unique boundary. The boundary in the example above could
have been the result of an algorithm designed to produce
boundaries with a very low probability of already existing in the
data to be encapsulated without having to prescan the data.
Alternate algorithms might result in more 'readable' boundaries
for a recipient with an old user agent, but would require more
attention to the possibility that the boundary might appear in the
encapsulated part. The simplest boundary possible is something
like "---", with a closing boundary of "-----".
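One way to get such a low-probability boundary without prescanning is
simply to use enough random characters. A minimal Python sketch; the
prefix and the amount of randomness are my own choices, not something
RFC 1521 prescribes:

```python
import secrets


def make_boundary(prefix="ipp-boundary-"):
    """Return a boundary that is very unlikely to occur in any document.

    32 random hex digits (128 bits) follow the prefix, so the chance of
    the string appearing in the encapsulated data is negligible and the
    data never has to be scanned. The result stays under the 70
    character limit for MIME boundaries.
    """
    return prefix + secrets.token_hex(16)


def delimiters(boundary):
    """Return the (separator, closing) delimiter lines for a boundary."""
    return "--" + boundary, "--" + boundary + "--"
```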
Page 8:
I still think that my proposal is the most viable in the HTTP context.
You show Content-length and Document-length fields. These fields are
nice in some circumstances, but they create a problem when documents
are received via stdin and their length is not known in advance.
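To illustrate the point, here is a rough Python sketch of a sender
that copies a document from a stream such as stdin into a multipart
body. It never needs to know the length, because the closing boundary
marks the end. The function name and chunk size are assumptions, and
a real sender would also emit the job attributes entity first.

```python
import shutil


def stream_document(source, sink, boundary, content_type):
    """Copy one document of unknown length into a multipart body.

    source, sink: binary file-like objects (e.g. sys.stdin.buffer and
                  a socket file).
    boundary, content_type: bytes.
    """
    sink.write(b"--" + boundary + b"\r\n")
    sink.write(b"Content-Type: " + content_type + b"\r\n\r\n")
    # Copy in fixed-size chunks until the source reports end of file.
    shutil.copyfileobj(source, sink, 8192)
    # The closing delimiter, not a length field, ends the document.
    sink.write(b"\r\n--" + boundary + b"--\r\n")
```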