Pete,
Thanks for the correction. A few items for clarification to see if I got it
right. I would very much appreciate either agreement with or correction of
my interpretation.
1. Hardcopy document refers just to the material that is being scanned
and has no other correlation to Digital Document. For example, conceivably,
a set of pages from multiple books could be scanned with the image data
ultimately appearing in a single document. Alternatively, multiple pages for
a single hardcopy document could be scanned and appear as multiple Digital
Documents.
2. Image refers to the content in a specified scan region of a
hardcopy document. Unless the scanner is set up to combine the data from
multiple scanned images, an image cannot refer to more than the content of
one sheet side. Therefore, for example, a scan job concerned with multiple
pages of a hardcopy document will contain multiple images.
3. There is a 1:1 relationship between Digital Documents and files, in
that each Digital Document is formatted and stored as a separate file.
4. In general, the relationship of images to digital documents is
format (and perhaps implementation) specific. If the
'document-format-accepted' is a document format such as PDF, there may be
multiple images per document. If 'document-format-accepted' is an image
format such as JPEG or GIF, each image is more likely a separate digital
document.
5. In IPP Scan Push mode, all Digital Documents produced in a Job are
scanned and formatted in the same way and stored to the same
destination(s), as specified in the CreateJob.
a. If the job produces multiple digital documents, the
destination is a directory with each Digital Document being a separate file
in that directory.
b. If the job produces a single document, the specified
destination is the file itself.
6. In IPP Scan Push mode, the client may specify multiple
destinations.
7. In IPP Scan Pull mode, each digital document produced by a job is
sent back to the client in response to a GetNextDocumentImage request from
the Client. Each document may contain data from a single image or from
multiple images. There may be multiple documents as part of a single job.
a. Unlike in Push mode, the Compression Accepted and Document
Format Accepted may be separately specified in each GetNextDocumentImage
request. (I find this rather odd - especially since each such request does
not necessarily correspond to either a Document or an image.)
b. The mode of operation of GetNextDocumentImage depends upon
whether Wait Mode is agreed upon. In Wait mode, data is sent as it becomes
available and can be accepted. If not in Wait mode (or if Wait mode is
interrupted or timed out), the client must issue a GetNextDocumentImage for
each buffer's-worth of data.
8. This mode of transfer suggests that GetNextDocumentImage does not
refer either to getting an Image or getting a Document; it just pulls data
in a mode determined by the Wait mode. That data may be formatted into one
or more Digital Documents, depending on format and contents.
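A minimal Python sketch of the reading in item 4, for illustration only. The function name and the set of multi-image formats are assumptions made for this example; as noted above, the actual grouping is format and perhaps implementation specific, so a real Scan Service may behave differently.

```python
# Hypothetical classification: formats assumed able to hold several images
# in one Digital Document. Real devices may differ.
MULTI_IMAGE_FORMATS = {"application/pdf"}


def group_images(images, document_format):
    """Return a list of Digital Documents, each a list of scanned images."""
    images = list(images)
    if not images:
        return []
    if document_format in MULTI_IMAGE_FORMATS:
        # Document format such as PDF: all images may share one document.
        return [images]
    # Image format such as JPEG or GIF: each image is likely its own document.
    return [[image] for image in images]


pdf_docs = group_images(["side-1", "side-2", "side-3"], "application/pdf")
jpeg_docs = group_images(["side-1", "side-2", "side-3"], "image/jpeg")
print(len(pdf_docs), len(jpeg_docs))
```

Under item 3, each inner list would then be stored as one file.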
I also understand that you suggest that the Scan Service in SM3 be changed
to agree with IPP Scan.
Many thanks,
Bill Wagner
-----Original Message-----
From: Zehler, Peter [mailto:Peter.Zehler at xerox.com]
Sent: Friday, September 26, 2014 9:20 AM
To: William A Wagner; 'Michael Sweet'
Cc: ipp at pwg.org; cloud at pwg.org
Subject: RE: IPP Scan question.
All,
IPP Scan can support multiple document jobs. There are attributes that
allow the printer to declare that capability
("multiple-document-jobs-supported") as well as operational attributes
("document-number", "last-document") to segment the data pulled from the
scan service into multiple files (i.e. one file per document, number of
images in a file is format and implementation specific). During the
prototype I used a scanner that emitted JPG or PDF. When loading a stack of
media into the ADF each image acquisition resulted in an image. The number
of document objects generated was dictated by output file type. In the IPP
binding I limited the file to document object association to 1 to 1. I did
not want to deal with the complexities of associating multiple files with a
single document object. The abstract MFD Scan model did allow multiple
files per document.
Running a stack of paper using JPG as the "document-format-accepted"
resulted in multiple files, each of which was associated with a single
document. Running that same stack of paper using PDF as the
"document-format-accepted" resulted in a single multipage file associated
with a single document. From the client perspective,
Get-Next-Document-Images behaved a bit differently for each job. With the JPG
output the responses had a document number that changed throughout the scan
job retrieval. The number of responses with the same document number varied
based on the complexity of the image. Each time the document number
changed, the output file was closed and a new one was opened. The last
Get-Next-Document-Images for the last document in the job set the
"last-document" to true. In a push job version of this scan job, the same
number of files was created at the destination. With the PDF output, the
responses had a document number that remained the same throughout the scan job
retrieval. When the last Get-Next-Document-Images for the job had
"last-document" set to true, the output file was closed. In a push job version
of this scan job, one file was created at the destination.
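The client-side behavior described above can be sketched roughly in Python as follows. This is an illustration under stated assumptions, not the prototype code: each response is modeled as a plain dict whose keys mirror the attribute names in this message, the "data" key and the function name are hypothetical, and it is assumed that only the final response of the job carries "last-document" set to true.

```python
def collect_documents(responses):
    """Segment pull-scan responses into one byte string per document.

    Each response is a dict with the hypothetical keys "document-number",
    "last-document", and "data" (the chunk returned by that request).
    """
    documents = {}
    for response in responses:
        number = response["document-number"]
        # A changed document number starts a new output "file"; a repeated
        # number appends to the one already open.
        documents.setdefault(number, bytearray()).extend(response["data"])
        if response["last-document"]:
            break  # assumed to mark the end of the job's data
    return {number: bytes(data) for number, data in documents.items()}


# JPG-like job: the document number changes, so several files result.
jpg_job = [
    {"document-number": 1, "last-document": False, "data": b"aa"},
    {"document-number": 1, "last-document": False, "data": b"bb"},
    {"document-number": 2, "last-document": True, "data": b"cc"},
]
# PDF-like job: the document number stays the same, so one file results.
pdf_job = [
    {"document-number": 1, "last-document": False, "data": b"aa"},
    {"document-number": 1, "last-document": True, "data": b"bb"},
]
jpg_files = collect_documents(jpg_job)
pdf_files = collect_documents(pdf_job)
print(len(jpg_files), len(pdf_files))
```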
The MFD Scan model was created with the idea that the same protocol would be
used locally or remotely. Therefore there was considerably more control over
the behavior of the scanner itself. The IPP Scan service simplified a
number of aspects to address the 98% needs for network scanning in a mobile
environment. I expect the MFD Scan service would be adjusted to better
reflect implementation experience within the PWG (i.e., IPP Scan) and in the
industry (e.g., WS-Scan, UPnP Scan, vendor specific scan).
Peter Zehler
PARC, A Xerox Company
800 Phillips Rd, 128-27E
Webster NY, 14580-9701
Email: Peter.Zehler at Xerox.com
Office: +1 (585) 265-8755
Mobile: +1 (585) 329-9508
FAX: +1 (585) 265-7441
-----Original Message-----
From: William A Wagner [mailto:wamwagner at comcast.net]
Sent: Thursday, September 25, 2014 2:15 PM
To: 'Michael Sweet'
Cc: Zehler, Peter; ipp at pwg.org; cloud at pwg.org
Subject: RE: IPP Scan question.
Michael,
Thank you for your response.
1. I agree that Figure 3 of the MFD Scan spec definitely indicates that
there can be multiple images in one scan document; I do not see where it
indicates that there cannot be multiple documents in a job. Furthermore,
Figure 4 of that same document (with the associated text) definitely states
that, for a multi-document Job, "Job object contains multiple Document
objects. Each Document can have a different set of processing parameters."
And further that the Scan Service semantic model may allow the End User to
specify a multi-document Job as a service output. If we have intentionally
decided to not consider multi-document jobs in IPP, that should be made
clear. I think it is to be determined if we decide to eliminate them from
the SM3. (Incidentally, I do not see a compelling Use Case for
multi-document Scan Jobs, although some may exist.)
2. I get your explanation that Get-Next-Document-Images refers to multiple
images of a document, and that "last-document" refers to the last image of a
document. But these names are misleading. Do we use 'Images' to refer to
anything other than 'Document Images'?
I apologize for not commenting on the IPP Scan document earlier, but I think
the one document per job characteristic, despite what one might expect from
the names, should be made more clear. Also, as you suggest, the fact that,
for Pull Scan, GetNextDocumentImages can redefine Compression Accepted
and Document Format Accepted for each image of a potentially multi-image
document should be clarified.
Thanks,
Bill Wagner
-----Original Message-----
From: Michael Sweet [mailto:msweet at apple.com]
Sent: Thursday, September 25, 2014 9:12 AM
To: William A Wagner
Cc: Zehler, Peter; ipp at pwg.org; cloud at pwg.org
Subject: Re: IPP Scan question.
Bill,
> On Sep 21, 2014, at 9:50 AM, William A Wagner <wamwagner at comcast.net> wrote:
> ...
> It is also clear from the IPP Scan specification GetNextDocumentImages
> operation that a scan job can have multiple documents.
I don't think these are multiple document objects, however.
Get-Next-Document-Images is a convenient way to pull one or more
images/pages from the scanner, but from the point of view of the model they
are part of one document object and would be delivered (in the case of push
scan) as a single file.
>
> The Cloud conference call comment is that FetchJob (corresponding to
> Destination, DestinationAccesses, and InputElements for Scan with no
> need to have a FetchDocument operation. This suggests that there is but
> one document (possibly with multiple destinations) in a Scan Job.
> Alternatively, it may be that the Input Parameters and Destinations for
> each one of multiple documents are defined in the CreateJob. This seems
> inconsistent with the general Imaging Service model.
In the case of Scan, the CreateScanJob operation is instantiating a single
scan job containing a single document object that may have multiple digital
representations (e.g. PDF, TIFF, etc.) of the same images. Figure 3 on page
22 of the MFD Scan spec seems pretty clear on that point. This is similar
to how the Copy and FaxIn services work (single document jobs).
Print, FaxOut, and Transform can support multiple digital document inputs
(and thus multiple document objects).
I think the only inconsistency here is that some job services support
multiple document objects and some don't. But I don't think that hurts the
overall model - just something worth pointing out.
(and perhaps as well worth considering/mentioning that most Print and FaxOut
service implementations only support single document jobs...)
> The IPP Scan specification definitely refers to multiple documents in
> one scan job. However, Figure 1 can be interpreted to mean that the only
> operation necessary for Scan is a CreateJob, with GetNextDocumentImages
> necessary if it is a Pull Scan Job. Indeed, InputAttributes is defined to be
> in the CreateJob request as well as are the Job Template attributes defining
> destination; but it does not appear that different InputAttributes and/or
> destinations can be specified for different documents.
I think the choice of reusing the "last-document" operation attribute in the
response of Get-Next-Document-Images operation is causing confusion here. It
really is (semantically) "last-document-image".
Pete, do you think this is worth an editorial change before publication,
either the attribute name or the description ("indicating that the last
document IMAGE has been reached")?
> [Also, Compression Accepted and Document Format Accepted are defined
> in CreateJob, but also in GetNextDocumentImages for Pull Scans. Can it
> be assumed that requests in GetNextDocumentImages take precedence?]
I think this needs some clarification - you put those in Create-Job for a
Push Scan and in Get-Next-Document-Images for a Pull Scan.
> Do I correctly understand that, although there may be multiple
> documents in a scan job, they must all have the same InputAttributes and
> the same destination(s)? An alternate approach might have been to send a
> SetDocumentAttributes for each document to be scanned, which contained
> the input parameters and destination for each specific document/image file;
> that would have been consistent with the Model.
Currently you scan whatever is at the input source and send it to the
destination(s) or pull the images with Get-Next-Document-Images. The only
way to break things up is to create multiple jobs and specify the number of
images for each job in the "input-images-to-transfer" member attribute.
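As a rough Python illustration of breaking a stack up that way (a sketch only): the helper below merely computes how many images each job would request. The function name and the bare dict layout are hypothetical; only the "input-images-to-transfer" name comes from the spec, and no real IPP request is built or sent here.

```python
def plan_scan_jobs(total_images, images_per_job):
    """Split a stack of images into per-job "input-images-to-transfer" counts."""
    if images_per_job < 1:
        raise ValueError("images_per_job must be at least 1")
    jobs = []
    remaining = total_images
    while remaining > 0:
        count = min(images_per_job, remaining)
        # Hypothetical layout: one dict per job to be created.
        jobs.append({"input-images-to-transfer": count})
        remaining -= count
    return jobs


jobs = plan_scan_jobs(7, 3)
print(jobs)
```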
> For Cloud, we need to decide whether we should reflect the Semantic
> Model (with which we should best be consistent) or the IPP Scan Binding.
> Or do we need to change the semantic model?
The intent is that IPP Scan would update the SM definition of SM Scan, since
SM Scan doesn't deal with Pull Scan.
> Also, a few minor editorial comments/questions I had while looking up
> stuff.
>
> 1. Table 1 lists Get-Next-Document-Images and refers to PWG 5100.SCAN. I
> take it that this means to have the specification refer to itself, but it
> is confusing even if the proper number is inserted. Better to refer to the
> internal paragraph.
Agreed.
> 2. Figure 1 refers to the operation as GetNextDocumentImage rather than
> GetNextDocumentImages
>
> 3. In para 7.1.1, under Group 2: Job Template Attributes is a reference to
> section 8.28.1.7.2. There is no such section (should it be 8.2?)
>
> 4. Although the text makes a distinction between Print Jobs and Scan Jobs,
> section 8.2.1.1 refers to a Print Job.
Thanks for catching these!
_________________________________________________________
Michael Sweet, Senior Printing System Engineer, PWG Chair