Other Proposed Steps
This page collects proposed extension steps. Implementations are welcome, but the contents are subject to change at any time.
These steps are in the “proposed extension namespace”,
http://exproc.org/proposed/steps, identified by the prefix
“pxp”.
pxp:nvdl
A step for performing NVDL (Namespace-based Validation Dispatching Language) validation over mixed-namespace documents.
<p:declare-step type="pxp:nvdl">
<p:input port="source" primary="true"/>
<p:input port="nvdl"/>
<p:input port="schemas" sequence="true"/>
<p:output port="result"/>
<p:option name="assert-valid" select="'true'"/> <!-- boolean -->
</p:declare-step>
The source document is validated using the namespace dispatching rules contained in the nvdl document.
The dispatching rules may contain URI references that point to the actual schemas to be used. As long as these schemas are accessible, it is not necessary to pass anything on the schemas port. However, if one or more schemas are provided on the schemas port, then these schemas should be used in validation.
This requirement is expressed only as a “should” and not a “must” because XProc version 1.0 does not mandate that implementations cache documents, so that a request for a URI by one step automatically retrieves the result of some other step whose base URI is identical to the requested URI.
However, it's not clear that the schemas port has any value if the implementation does not support this behavior.
The value of the assert-valid option
must be a boolean. It is a dynamic error
if the assert-valid option is true
and the input document is not valid.
The output from this step is a copy of the input, possibly augmented by schema processing. The output of this step may include PSVI annotations.
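As an illustrative sketch only, a pipeline might call the step as shown below. The library file pxp-library.xpl (assumed to contain the declaration above) and the document rules.nvdl are hypothetical names, and the example assumes the schemas referenced by the dispatching rules are reachable by URI, so nothing is passed on the schemas port.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical library assumed to hold the pxp:nvdl declaration -->
  <p:import href="pxp-library.xpl"/>

  <!-- The primary source port is connected to the pipeline input by default -->
  <pxp:nvdl assert-valid="true">
    <p:input port="nvdl">
      <!-- Hypothetical NVDL dispatching rules -->
      <p:document href="rules.nvdl"/>
    </p:input>
    <p:input port="schemas">
      <!-- No schemas supplied; the rules are assumed to reference them by URI -->
      <p:empty/>
    </p:input>
  </pxp:nvdl>
</p:declare-step>
```

With assert-valid set to true, an invalid source document raises a dynamic error instead of reaching the result port.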
pxp:unzip
A step for extracting information out of ZIP archives.
<p:declare-step type="pxp:unzip">
<p:output port="result"/>
<p:option name="href" required="true"/> <!-- anyURI -->
<p:option name="file"/> <!-- string -->
<p:option name="content-type"/> <!-- string -->
</p:declare-step>
The value of the href option
must be an IRI. It is a dynamic
error if the document so identified does not exist or
cannot be read.
The value of the file option, if specified,
must be the fully qualified path-name of a document
in the archive. It is a dynamic error if the
value specified does not identify a file in the archive.
The output from the pxp:unzip step must
conform to the
ziptoc.rnc schema.
If the file option is specified, the selected file
in the archive is extracted and returned:
If the content-type is not specified, or if an XML content type is specified, the file is parsed as XML and returned. It is a dynamic error if the file is not well-formed XML.
If the content-type specified is not an XML content type, the file is base64 encoded and returned in a single c:data element.
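The two cases can be sketched as follows. This is an illustration under assumptions, not normative text: pxp-library.xpl is a hypothetical library assumed to contain the pxp:unzip declaration, the step names and the extra notices output port are invented for the example, and the entries are assumed to exist in the archive at the URI shown.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result" primary="true">
    <p:pipe step="xml-entry" port="result"/>
  </p:output>
  <p:output port="notices">
    <p:pipe step="text-entry" port="result"/>
  </p:output>

  <!-- Hypothetical library assumed to hold the pxp:unzip declaration -->
  <p:import href="pxp-library.xpl"/>

  <!-- No content-type: the entry is parsed as XML and returned as a document -->
  <pxp:unzip name="xml-entry"
             href="http://xmlcalabash.com/download/calabash-0.8.5.zip"
             file="calabash-0.8.5/xpl/pipe.xpl"/>

  <!-- Non-XML content type: the entry is returned base64 encoded in c:data -->
  <pxp:unzip name="text-entry"
             href="http://xmlcalabash.com/download/calabash-0.8.5.zip"
             file="calabash-0.8.5/docs/NOTICES"
             content-type="text/plain"/>
</p:declare-step>
```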
If the file option is not specified,
a table of contents for the archive is returned.
For example, the contents of the XML Calabash 0.8.5 distribution archive might be reported like this:
<c:zipfile xmlns:c="http://www.w3.org/ns/xproc-step"
href="http://xmlcalabash.com/download/calabash-0.8.5.zip">
<c:directory name="calabash-0.8.5/" date="2008-11-04T19:29:20.000-05:00"/>
<c:directory name="calabash-0.8.5/docs/" date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="11942" size="36677" name="calabash-0.8.5/docs/CDDL+GPL.txt"
date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="928" size="2110" name="calabash-0.8.5/docs/ChangeLog"
date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="6817" size="17987" name="calabash-0.8.5/docs/GPL.txt"
date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="494" size="830" name="calabash-0.8.5/docs/NOTICES"
date="2008-11-04T19:29:20.000-05:00"/>
<c:directory name="calabash-0.8.5/lib/" date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="389650" size="407421" name="calabash-0.8.5/lib/calabash.jar"
date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="1237" size="2493" name="calabash-0.8.5/README"
date="2008-11-04T19:29:20.000-05:00"/>
<c:directory name="calabash-0.8.5/xpl/" date="2008-11-04T19:29:20.000-05:00"/>
<c:file compressed-size="175" size="255" name="calabash-0.8.5/xpl/pipe.xpl"
date="2008-11-04T19:29:20.000-05:00"/>
</c:zipfile>
pxp:zip
A step for creating ZIP archives.
<p:declare-step type="pxp:zip">
<p:input port="source" sequence="true" primary="true"/>
<p:input port="manifest"/>
<p:output port="result"/>
<p:option name="href" required="true"/> <!-- anyURI -->
<p:option name="compression-method"/> <!-- "stored" | "deflated" -->
<p:option name="compression-level"/> <!-- "smallest" | "fastest" | "default" | "huffman" | "none" -->
<p:option name="command" select="'update'"/> <!-- "update" | "freshen" | "create" | "delete" -->
</p:declare-step>
The ZIP archive is identified by the href. The manifest
(described below) provides the list of files to be processed in the archive.
The command
indicates the nature of the processing: “update”, “freshen”,
“create”, or “delete”.
If files are added to the archive,
compression-method indicates how they should be added: “stored”
or “deflated”. For deflated files, the compression-level identifies
the kind of compression: “smallest”, “fastest”,
“default”, “huffman”, or “none”.
The entries identified by the manifest are processed. The manifest
must conform to the following schema:
default namespace c="http://www.w3.org/ns/xproc-step"
start = zip-manifest
zip-manifest =
element c:zip-manifest {
entry*
}
entry =
element c:entry {
attribute name { text }
& attribute href { text }
& attribute comment { text }?
& attribute method { "deflated" | "stored" }?
& attribute level { "smallest" | "fastest" | "huffman" | "default" | "none" }?
& empty
}
For example:
<zip-manifest xmlns="http://www.w3.org/ns/xproc-step">
<entry name="file1.xml" href="http://example.org/file1.xml" comment="An example file"/>
<entry name="path/to/file2.xml" href="http://example.org/file2.xml" method="stored"/>
</zip-manifest>
If the command is “delete”, then file1.xml
and path/to/file2.xml will be deleted from the archive. Otherwise, the
file that appears on the source port that has the base URI
http://example.org/file1.xml will be stored in the archive as file1.xml (using
the default method and level), and
the file that appears on the source port that has the base URI
http://example.org/file2.xml will be stored in the archive as
path/to/file2.xml without being compressed.
A c:zipfile description of the archive content is produced on the
result port.
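A complete invocation might look like the sketch below, which creates a new archive from the two documents named in the example manifest. The library pxp-library.xpl (assumed to contain the pxp:zip declaration) and the archive name archive.zip are hypothetical, and the example assumes that documents loaded with p:document keep the URI they were loaded from as their base URI.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result"/>

  <!-- Hypothetical library assumed to hold the pxp:zip declaration -->
  <p:import href="pxp-library.xpl"/>

  <pxp:zip href="archive.zip" command="create">
    <!-- Each source document is matched to a manifest entry by base URI -->
    <p:input port="source">
      <p:document href="http://example.org/file1.xml"/>
      <p:document href="http://example.org/file2.xml"/>
    </p:input>
    <p:input port="manifest">
      <p:inline>
        <c:zip-manifest>
          <c:entry name="file1.xml" href="http://example.org/file1.xml"
                   comment="An example file"/>
          <c:entry name="path/to/file2.xml" href="http://example.org/file2.xml"
                   method="stored"/>
        </c:zip-manifest>
      </p:inline>
    </p:input>
  </pxp:zip>
</p:declare-step>
```

The result port of this pipeline would then carry the c:zipfile description of the new archive.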
pxp:gunzip
Important
Deprecated: See pxp:uncompress.
A step for expanding gzipped data.
<p:declare-step type="pxp:gunzip">
<p:input port="source"/>
<p:output port="result"/>
</p:declare-step>
If the document that appears on the source port is
base64 encoded, this step will attempt to decode and
gunzip the data. As a convenience, if the data
is not encoded, it is simply passed through, like the p:identity
step.
It is a dynamic error if the decoded and expanded data is not a well-formed XML document.
pxp:gzip
Important
Deprecated: See pxp:compress.
A step for storing gzip compressed data.
<p:declare-step type="pxp:gzip">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="href"/> <!-- anyURI -->
<p:option name="byte-order-mark"/> <!-- boolean -->
<p:option name="cdata-section-elements" select="''"/> <!-- ListOfQNames -->
<p:option name="doctype-public"/> <!-- string -->
<p:option name="doctype-system"/> <!-- anyURI -->
<p:option name="encoding"/> <!-- string -->
<p:option name="escape-uri-attributes" select="'false'"/> <!-- boolean -->
<p:option name="include-content-type" select="'true'"/> <!-- boolean -->
<p:option name="indent" select="'false'"/> <!-- boolean -->
<p:option name="media-type"/> <!-- string -->
<p:option name="method" select="'xml'"/> <!-- QName -->
<p:option name="normalization-form" select="'none'"/> <!-- NormalizationForm -->
<p:option name="omit-xml-declaration" select="'true'"/> <!-- boolean -->
<p:option name="standalone" select="'omit'"/> <!-- "true" | "false" | "omit" -->
<p:option name="undeclare-prefixes"/> <!-- boolean -->
<p:option name="version" select="'1.0'"/> <!-- string -->
</p:declare-step>
The pxp:gzip step serializes the document that appears on its
source port and compresses it with gzip.
If the input document is base64 encoded, it is decoded and
the corresponding bytes are compressed.
If the href option is present, the step attempts
to store the compressed data to the IRI specified. In this case, it produces a
c:result element on its result port that contains the
IRI where the data was stored.
If the
href option is not present, the step returns
the compressed data in a base64 encoded c:data element
with the content type “application/x-gzip”.
pxp:compress
A step for storing compressed data.
<p:declare-step type="pxp:compress">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="href"/> <!-- anyURI -->
<p:option name="compression-method"/> <!-- string -->
<p:option name="byte-order-mark"/> <!-- boolean -->
<p:option name="cdata-section-elements" select="''"/> <!-- ListOfQNames -->
<p:option name="doctype-public"/> <!-- string -->
<p:option name="doctype-system"/> <!-- anyURI -->
<p:option name="encoding"/> <!-- string -->
<p:option name="escape-uri-attributes" select="'false'"/> <!-- boolean -->
<p:option name="include-content-type" select="'true'"/> <!-- boolean -->
<p:option name="indent" select="'false'"/> <!-- boolean -->
<p:option name="media-type"/> <!-- string -->
<p:option name="method" select="'xml'"/> <!-- QName -->
<p:option name="normalization-form" select="'none'"/> <!-- NormalizationForm -->
<p:option name="omit-xml-declaration" select="'true'"/> <!-- boolean -->
<p:option name="standalone" select="'omit'"/> <!-- "true" | "false" | "omit" -->
<p:option name="undeclare-prefixes"/> <!-- boolean -->
<p:option name="version" select="'1.0'"/> <!-- string -->
</p:declare-step>
The pxp:compress step serializes the document that appears on its
source port and compresses it.
If the input document is base64 encoded, it is decoded and
the corresponding bytes are compressed.
The compression-method option can be used to identify the
compression method used. Suggested values are
“bzip2”,
“compress”,
“gzip”, etc. If unspecified, the default method is
implementation defined.
Note
Would it be better to specify a default? Perhaps gzip?
It is a dynamic error if the compression method is unrecognized.
If the href option is present, the step attempts
to store the compressed data to the IRI specified. In this case, it produces a
c:result element on its result port that contains the
IRI where the data was stored.
If the
href option is not present, the step returns
the compressed data in a base64 encoded c:data element
with an appropriate content-type.
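For illustration, the following sketch compresses the pipeline's input document with gzip and returns the result inline. The library pxp-library.xpl is a hypothetical file assumed to contain the pxp:compress declaration, and “gzip” is only one of the suggested method names, so an implementation may or may not recognize it.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical library assumed to hold the pxp:compress declaration -->
  <p:import href="pxp-library.xpl"/>

  <!-- No href option: the compressed bytes come back base64 encoded in c:data -->
  <pxp:compress compression-method="gzip" indent="true"/>
</p:declare-step>
```

Adding an href option to the step would instead store the compressed data at that IRI and produce a c:result element on the result port.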
pxp:uncompress
A step for expanding compressed data.
<p:declare-step type="pxp:uncompress">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="compression-method"/> <!-- string -->
</p:declare-step>
If the document that appears on the source port is
base64 encoded, this step will decode and attempt to
uncompress the data. As a convenience, if the data is not encoded, the
XML document is simply passed through, like the p:identity step.
The compression-method option can be used to identify the
compression method used. Suggested values are
“bzip2”,
“compress”,
“gzip”, etc. If unspecified, implementations are free
to attempt to deduce the method from the data.
It is a dynamic error if:
the compression method is unrecognized or
the decoded and expanded data is not a well-formed XML document.
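One way to exercise the step is a round trip through pxp:compress, sketched below. The library pxp-library.xpl is a hypothetical file assumed to contain both declarations, and the sketch assumes that the base64 encoded c:data element produced by pxp:compress is accepted as encoded input by pxp:uncompress.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical library assumed to hold both step declarations -->
  <p:import href="pxp-library.xpl"/>

  <!-- Serialize and compress the input; the output is a c:data element -->
  <pxp:compress compression-method="gzip"/>

  <!-- Decode, expand, and parse the data back into an XML document -->
  <pxp:uncompress compression-method="gzip"/>
</p:declare-step>
```

Naming the method explicitly on pxp:uncompress avoids relying on the implementation to deduce it from the data.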
pxp:set-base-uri
A step for changing the base URI of a document.
<p:declare-step type="pxp:set-base-uri">
<p:input port="source"/>
<p:output port="result"/>
<p:option name="uri" required="true"/> <!-- string -->
</p:declare-step>
The document that appears on the source port is copied
to the result port. The base URI of the copied document will
be the URI specified in the uri option. If the URI specified
is relative, it will be made absolute with respect to the base URI of
the option element.
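A minimal sketch of a call is shown below. The library pxp-library.xpl is a hypothetical file assumed to contain the pxp:set-base-uri declaration, and the URI is taken from the pxp:zip manifest example purely for illustration. One plausible use, since pxp:zip matches source documents to manifest entries by base URI, is to give a generated document the base URI that a manifest entry expects.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical library assumed to hold the pxp:set-base-uri declaration -->
  <p:import href="pxp-library.xpl"/>

  <pxp:set-base-uri>
    <!-- An absolute URI is used as is; a relative one would be resolved
         against the base URI of this p:with-option element -->
    <p:with-option name="uri" select="'http://example.org/file1.xml'"/>
  </pxp:set-base-uri>
</p:declare-step>
```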
