eXProc.org

Other Proposed Steps

This page collects proposed extension steps. Implementations are welcome, but the contents are subject to change at any time.

These steps are in the “proposed extension namespace”, http://exproc.org/proposed/steps, identified by the prefix “pxp”.

pxp:nvdl

A step for performing NVDL (Namespace-based Validation Dispatching Language) validation over mixed-namespace documents.

<p:declare-step type="pxp:nvdl">
     <p:input port="source" primary="true"/>
     <p:input port="nvdl"/>
     <p:input port="schemas" sequence="true"/>
     <p:output port="result"/>
     <p:option name="assert-valid" select="'true'"/>               <!-- boolean -->
</p:declare-step>

The source document is validated using the namespace dispatching rules contained in the nvdl document.

The dispatching rules may contain URI references that point to the actual schemas to be used. As long as these schemas are accessible, it is not necessary to pass anything on the schemas port. However, if one or more schemas are provided on the schemas port, then these schemas should be used in validation.

This requirement is expressed only as a “should” and not a “must” because XProc version 1.0 does not require implementations to cache documents. With such caching, a request for a URI by one step could automatically be served the result of some other step, provided that result had a base URI identical to the requested URI.

However, it's not clear that the schemas port has any value if the implementation does not support this behavior.

The value of the assert-valid option must be a boolean. It is a dynamic error if the assert-valid option is true and the input document is not valid.

The output from this step is a copy of the input, possibly augmented by the application of schema processing. The output may include PSVI annotations.
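
As an illustration of the description above, the following is a minimal sketch of a pipeline that calls pxp:nvdl. It assumes an XProc 1.0 processor that implements the step; the file name dispatch.nvdl is hypothetical, and the nested declaration simply repeats the signature given above so that the example is self-contained. If the processor already supplies the declaration (for example through a library import), the nested p:declare-step should be removed.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:input port="source" primary="true"/>
  <p:output port="result"/>

  <!-- Atomic step declaration, copied from the signature on this page -->
  <p:declare-step type="pxp:nvdl">
    <p:input port="source" primary="true"/>
    <p:input port="nvdl"/>
    <p:input port="schemas" sequence="true"/>
    <p:output port="result"/>
    <p:option name="assert-valid" select="'true'"/>
  </p:declare-step>

  <!-- The primary source port is connected implicitly to the pipeline's source -->
  <pxp:nvdl assert-valid="true">
    <p:input port="nvdl">
      <!-- Hypothetical NVDL script; its rules reference the schemas by URI -->
      <p:document href="dispatch.nvdl"/>
    </p:input>
    <p:input port="schemas">
      <!-- Nothing is passed here, so the schemas must be reachable by URI -->
      <p:empty/>
    </p:input>
  </pxp:nvdl>
</p:declare-step>
```

With assert-valid set to true, an invalid source document causes the pipeline to fail with a dynamic error instead of producing a result.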

pxp:unzip

A step for extracting information out of ZIP archives.

<p:declare-step type="pxp:unzip">
     <p:output port="result"/>
     <p:option name="href" required="true"/>                       <!-- anyURI -->
     <p:option name="file"/>                                       <!-- string -->
     <p:option name="content-type"/>                               <!-- string -->
</p:declare-step>

The value of the href option must be an IRI. It is a dynamic error if the document so identified does not exist or cannot be read.

The value of the file option, if specified, must be the fully qualified path name of a document in the archive. It is a dynamic error if the value specified does not identify a file in the archive.

The output from the pxp:unzip step must conform to the ziptoc.rnc schema.

If the file option is specified, the selected file in the archive is extracted and returned:

  • If the content-type is not specified, or if an XML content type is specified, the file is parsed as XML and returned. It is a dynamic error if the file is not well-formed XML.

  • If the content-type specified is not an XML content type, the file is base64-encoded and returned in a single c:data element.

If the file option is not specified, a table of contents for the archive is returned.

For example, the contents of the XML Calabash 0.8.5 distribution archive might be reported like this:

<c:zipfile xmlns:c="http://www.w3.org/ns/xproc-step"
           href="http://xmlcalabash.com/download/calabash-0.8.5.zip">
   <c:directory name="calabash-0.8.5/" date="2008-11-04T19:29:20.000-05:00"/>
   <c:directory name="calabash-0.8.5/docs/" date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="11942" size="36677" name="calabash-0.8.5/docs/CDDL+GPL.txt"
           date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="928" size="2110" name="calabash-0.8.5/docs/ChangeLog"
           date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="6817" size="17987" name="calabash-0.8.5/docs/GPL.txt"
           date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="494" size="830" name="calabash-0.8.5/docs/NOTICES"
           date="2008-11-04T19:29:20.000-05:00"/>
   <c:directory name="calabash-0.8.5/lib/" date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="389650" size="407421" name="calabash-0.8.5/lib/calabash.jar"
           date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="1237" size="2493" name="calabash-0.8.5/README"
           date="2008-11-04T19:29:20.000-05:00"/>
   <c:directory name="calabash-0.8.5/xpl/" date="2008-11-04T19:29:20.000-05:00"/>
   <c:file compressed-size="175" size="255" name="calabash-0.8.5/xpl/pipe.xpl"
           date="2008-11-04T19:29:20.000-05:00"/>
</c:zipfile>
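
To extract a single file rather than list the archive, the file option is supplied. The following is a minimal sketch, assuming an XProc 1.0 processor that implements pxp:unzip; it reuses the archive URI and one entry name from the listing above, and the nested declaration repeats the signature given earlier (remove it if the processor already supplies one).

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result"/>

  <!-- Atomic step declaration, copied from the signature on this page -->
  <p:declare-step type="pxp:unzip">
    <p:output port="result"/>
    <p:option name="href" required="true"/>
    <p:option name="file"/>
    <p:option name="content-type"/>
  </p:declare-step>

  <!-- No content-type is given, so the entry is parsed as XML;
       omitting the file option would return the table of contents instead -->
  <pxp:unzip href="http://xmlcalabash.com/download/calabash-0.8.5.zip"
             file="calabash-0.8.5/xpl/pipe.xpl"/>
</p:declare-step>
```

Supplying a non-XML value for content-type (for instance text/plain for one of the .txt entries) would instead return the entry base64-encoded inside a c:data element, as described above.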

pxp:zip

A step for creating ZIP archives.

<p:declare-step type="pxp:zip">
     <p:input port="source" sequence="true" primary="true"/>
     <p:input port="manifest"/>
     <p:output port="result"/>
     <p:option name="href" required="true"/>                       <!-- anyURI -->
     <p:option name="compression-method"/>                         <!-- "stored" | "deflated" -->
     <p:option name="compression-level"/>                          <!-- "smallest" | "fastest" | "default" | "huffman" | "none" -->
     <p:option name="command" select="'update'"/>                  <!-- "update" | "freshen" | "create" | "delete" -->
</p:declare-step>

The ZIP archive is identified by the href. The manifest (described below) provides the list of files to be processed in the archive. The command indicates the nature of the processing: “update”, “freshen”, “create”, or “delete”.

If files are added to the archive, compression-method indicates how they should be added: “stored” or “deflated”. For deflated files, the compression-level identifies the kind of compression: “smallest”, “fastest”, “default”, “huffman”, or “none”.

The entries identified by the manifest are processed. The manifest must conform to the following schema:

default namespace c="http://www.w3.org/ns/xproc-step"

start = zip-manifest

zip-manifest =
   element c:zip-manifest {
      entry*
   }

entry =
   element c:entry {
      attribute name { text }
    & attribute href { text }
    & attribute comment { text }?
    & attribute method { "deflated" | "stored" }?
    & attribute level { "smallest" | "fastest" | "huffman" | "default" | "none" }?
    & empty
   }

For example:

<zip-manifest xmlns="http://www.w3.org/ns/xproc-step">
  <entry name="file1.xml" href="http://example.org/file1.xml" comment="An example file"/>
  <entry name="path/to/file2.xml" href="http://example.org/file2.xml" method="stored"/>
</zip-manifest>

If the command is “delete”, then file1.xml and path/to/file2.xml will be deleted from the archive. Otherwise, the file that appears on the source port with the base URI http://example.org/file1.xml will be stored in the archive as file1.xml (using the default method and level), and the file that appears on the source port with the base URI http://example.org/file2.xml will be stored in the archive as path/to/file2.xml without being compressed.

A c:zipfile description of the archive content is produced on the result port.
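
Putting the pieces together, the following is a minimal sketch of a pipeline that creates an archive from the example manifest. It assumes an XProc 1.0 processor that implements pxp:zip; the archive location file:///tmp/example.zip is hypothetical, and the nested declaration repeats the signature given earlier (remove it if the processor already supplies one).

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result"/>

  <!-- Atomic step declaration, copied from the signature on this page -->
  <p:declare-step type="pxp:zip">
    <p:input port="source" sequence="true" primary="true"/>
    <p:input port="manifest"/>
    <p:output port="result"/>
    <p:option name="href" required="true"/>
    <p:option name="compression-method"/>
    <p:option name="compression-level"/>
    <p:option name="command" select="'update'"/>
  </p:declare-step>

  <!-- Hypothetical target archive; "create" is one of the documented commands -->
  <pxp:zip href="file:///tmp/example.zip" command="create">
    <p:input port="source">
      <!-- Each document's base URI must match an href in the manifest -->
      <p:document href="http://example.org/file1.xml"/>
      <p:document href="http://example.org/file2.xml"/>
    </p:input>
    <p:input port="manifest">
      <p:inline>
        <c:zip-manifest>
          <c:entry name="file1.xml" href="http://example.org/file1.xml"
                   comment="An example file"/>
          <c:entry name="path/to/file2.xml" href="http://example.org/file2.xml"
                   method="stored"/>
        </c:zip-manifest>
      </p:inline>
    </p:input>
  </pxp:zip>
</p:declare-step>
```

The documents are matched to manifest entries by base URI, which is why the p:document references and the href attributes in the manifest use the same URIs.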