Other Proposed Steps
This page collects proposed extension steps. Implementations are welcome, but the contents are subject to change at any time.
These steps are in the “proposed extension namespace”, http://exproc.org/proposed/steps, identified by the prefix “pxp”.
pxp:nvdl
A step for performing NVDL (Namespace-based Validation Dispatching Language) validation over mixed-namespace documents.
<p:declare-step type="pxp:nvdl">
  <p:input port="source" primary="true"/>
  <p:input port="nvdl"/>
  <p:input port="schemas" sequence="true"/>
  <p:output port="result"/>
  <p:option name="assert-valid" select="'true'"/>  <!-- boolean -->
</p:declare-step>
The source document is validated using the namespace dispatching rules contained in the nvdl document.
The dispatching rules may contain URI references that point to the actual schemas to be used. As long as these schemas are accessible, it is not necessary to pass anything on the schemas port. However, if one or more schemas are provided on the schemas port, then these schemas should be used in validation.
This requirement is expressed only as a “should” and not a “must” because XProc version 1.0 does not mandate that implementations support caching of documents so that requests for a URI by one step can automatically access the result of some other step if that result had a base URI identical to the requested document.
However, it's not clear that the schemas port has any value if the implementation does not support this behavior.
The value of the assert-valid option must be a boolean. It is a dynamic error if the assert-valid option is true and the input document is not valid.
The output from this step is a copy of the input, possibly augmented by the application of schema processing. The output of this step may include PSVI annotations.
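As an illustration of the description above, a pipeline might invoke the step roughly as follows. This is a minimal sketch, not a normative example: the file names doc.xml and dispatch.nvdl are hypothetical, and extension-library.xpl is a placeholder for whatever library (if any) your processor supplies to declare the pxp steps. The schemas port is bound to p:empty on the assumption that the schemas referenced from the NVDL document are directly accessible.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result"/>

  <!-- Placeholder import (assumption): replace with the library that
       declares the pxp steps for your processor. -->
  <p:import href="extension-library.xpl"/>

  <pxp:nvdl assert-valid="true">
    <!-- Hypothetical document to validate -->
    <p:input port="source">
      <p:document href="doc.xml"/>
    </p:input>
    <!-- Hypothetical NVDL dispatching rules -->
    <p:input port="nvdl">
      <p:document href="dispatch.nvdl"/>
    </p:input>
    <!-- No schemas passed explicitly; the NVDL rules are assumed to
         reference accessible schemas by URI. -->
    <p:input port="schemas">
      <p:empty/>
    </p:input>
  </pxp:nvdl>
</p:declare-step>
```

With assert-valid set to true, an invalid doc.xml should cause a dynamic error rather than a result document.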
pxp:unzip
A step for extracting information out of ZIP archives.
<p:declare-step type="pxp:unzip">
  <p:output port="result"/>
  <p:option name="href" required="true"/>  <!-- anyURI -->
  <p:option name="file"/>  <!-- string -->
  <p:option name="content-type"/>  <!-- string -->
</p:declare-step>
The value of the href option must be an IRI. It is a dynamic error if the document so identified does not exist or cannot be read.
The value of the file option, if specified, must be the fully qualified path-name of a document in the archive. It is a dynamic error if the value specified does not identify a file in the archive.
The output from the pxp:unzip step must conform to the ziptoc.rnc schema.
If the file option is specified, the selected file in the archive is extracted and returned:
If the content-type is not specified, or if an XML content type is specified, the file is parsed as XML and returned. It is a dynamic error if the file is not well-formed XML.
If the content-type specified is not an XML content type, the file is base64 encoded and returned in a single c:data element.
If the file option is not specified, a table of contents for the archive is returned.
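The two modes described above (table of contents versus single-file extraction) could be exercised in one pipeline along these lines. This is only a sketch: archive.zip and docs/readme.xml are hypothetical names, and extension-library.xpl stands in for whatever library declares the pxp steps in your processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <!-- The extracted file -->
  <p:output port="result" primary="true">
    <p:pipe step="extract" port="result"/>
  </p:output>
  <!-- The archive's table of contents -->
  <p:output port="toc">
    <p:pipe step="list" port="result"/>
  </p:output>

  <!-- Placeholder import (assumption): replace with the library that
       declares the pxp steps for your processor. -->
  <p:import href="extension-library.xpl"/>

  <!-- No file option: a table of contents is returned -->
  <pxp:unzip name="list" href="archive.zip"/>

  <!-- file option given, no content-type: the entry is parsed as XML -->
  <pxp:unzip name="extract" href="archive.zip" file="docs/readme.xml"/>
</p:declare-step>
```

To extract a non-XML entry, one would add a content-type option such as image/png and expect a base64 encoded c:data element instead.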
For example, the contents of the XML Calabash 0.8.5 distribution archive might be reported like this:
pxp:zip
A step for creating ZIP archives.
<p:declare-step type="pxp:zip">
  <p:input port="source" sequence="true" primary="true"/>
  <p:input port="manifest"/>
  <p:output port="result"/>
  <p:option name="href" required="true"/>  <!-- anyURI -->
  <p:option name="compression-method"/>  <!-- "stored" | "deflated" -->
  <p:option name="compression-level"/>  <!-- "smallest" | "fastest" | "default" | "huffman" | "none" -->
  <p:option name="command" select="'update'"/>  <!-- "update" | "freshen" | "create" | "delete" -->
</p:declare-step>
The ZIP archive is identified by the href. The manifest (described below) provides the list of files to be processed in the archive. The command indicates the nature of the processing: “update”, “freshen”, “create”, or “delete”.
If files are added to the archive, compression-method indicates how they should be added: “stored” or “deflated”. For deflated files, the compression-level identifies the kind of compression: “smallest”, “fastest”, “default”, “huffman”, or “none”.
The entries identified by the manifest are processed. The manifest must conform to the following schema:
default namespace c = "http://www.w3.org/ns/xproc-step"
start = zip-manifest
zip-manifest =
  element c:zip-manifest {
    entry*
  }
entry =
  element c:entry {
    ( attribute name { text }
    & attribute href { text }
    & attribute comment { text }?
    & attribute method { "deflated" | "stored" }?
    & attribute level { "smallest" | "fastest" | "huffman" | "default" | "none" }? ),
    empty
  }
For example:
<zip-manifest xmlns="http://www.w3.org/ns/xproc-step">
  <entry name="file1.xml" href="http://example.org/file1.xml" comment="An example file"/>
  <entry name="path/to/file2.xml" href="http://example.org/file2.xml" method="stored"/>
</zip-manifest>
If the command is “delete”, then file1.xml and path/to/file2.xml will be deleted from the archive. Otherwise, the file that appears on the source port that has the base URI http://example.org/file1.xml will be stored in the archive as file1.xml (using the default method and level), and the file that appears on the source port that has the base URI http://example.org/file2.xml will be stored in the archive as path/to/file2.xml without being compressed.
A c:zipfile description of the archive content is produced on the result port.
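Putting the pieces together, a pipeline that creates an archive from the manifest shown earlier might look like the sketch below. The output location out/archive.zip is hypothetical, extension-library.xpl is a placeholder for the library that declares the pxp steps in your processor, and the example assumes that documents loaded with p:document carry the requested URI as their base URI so that they match the href values in the manifest.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <!-- Receives the c:zipfile description of the archive -->
  <p:output port="result"/>

  <!-- Placeholder import (assumption): replace with the library that
       declares the pxp steps for your processor. -->
  <p:import href="extension-library.xpl"/>

  <pxp:zip href="out/archive.zip" command="create">
    <!-- Documents are matched to manifest entries by base URI -->
    <p:input port="source">
      <p:document href="http://example.org/file1.xml"/>
      <p:document href="http://example.org/file2.xml"/>
    </p:input>
    <p:input port="manifest">
      <p:inline>
        <c:zip-manifest>
          <c:entry name="file1.xml"
                   href="http://example.org/file1.xml"
                   comment="An example file"/>
          <c:entry name="path/to/file2.xml"
                   href="http://example.org/file2.xml"
                   method="stored"/>
        </c:zip-manifest>
      </p:inline>
    </p:input>
  </pxp:zip>
</p:declare-step>
```

Changing command to delete with the same manifest would instead remove the two named entries from an existing archive.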
pxp:gunzip
Important
Deprecated: See pxp:uncompress.
A step for expanding gzipped data.
<p:declare-step type="pxp:gunzip">
  <p:input port="source"/>
  <p:output port="result"/>
</p:declare-step>
If the document that appears on the source port is base64 encoded, this step will attempt to decode and gunzip the data. As a convenience, if the data is not encoded, it is simply passed through, like the p:identity step.
It is a dynamic error if the resulting, decoded and expanded data is not a well-formed XML document.
pxp:gzip
Important
Deprecated: See pxp:compress.
A step for storing gzip compressed data.
<p:declare-step type="pxp:gzip">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="href"/>  <!-- anyURI -->
  <p:option name="byte-order-mark"/>  <!-- boolean -->
  <p:option name="cdata-section-elements" select="''"/>  <!-- ListOfQNames -->
  <p:option name="doctype-public"/>  <!-- string -->
  <p:option name="doctype-system"/>  <!-- anyURI -->
  <p:option name="encoding"/>  <!-- string -->
  <p:option name="escape-uri-attributes" select="'false'"/>  <!-- boolean -->
  <p:option name="include-content-type" select="'true'"/>  <!-- boolean -->
  <p:option name="indent" select="'false'"/>  <!-- boolean -->
  <p:option name="media-type"/>  <!-- string -->
  <p:option name="method" select="'xml'"/>  <!-- QName -->
  <p:option name="normalization-form" select="'none'"/>  <!-- NormalizationForm -->
  <p:option name="omit-xml-declaration" select="'true'"/>  <!-- boolean -->
  <p:option name="standalone" select="'omit'"/>  <!-- "true" | "false" | "omit" -->
  <p:option name="undeclare-prefixes"/>  <!-- boolean -->
  <p:option name="version" select="'1.0'"/>  <!-- string -->
</p:declare-step>
The pxp:gzip step serializes the document that appears on its source port and compresses it with gzip.
If the input document is base64 encoded, it is decoded and the corresponding bytes are compressed.
If the href option is present, the step attempts to store the compressed data to the IRI specified. In this case, it produces a c:result element on its result port that contains the IRI where the data was stored.
If the href option is not present, the step returns the compressed data in a base64 encoded c:data element with the content type “application/x-gzip”.
pxp:compress
A step for storing compressed data.
<p:declare-step type="pxp:compress">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="href"/>  <!-- anyURI -->
  <p:option name="compression-method"/>  <!-- string -->
  <p:option name="byte-order-mark"/>  <!-- boolean -->
  <p:option name="cdata-section-elements" select="''"/>  <!-- ListOfQNames -->
  <p:option name="doctype-public"/>  <!-- string -->
  <p:option name="doctype-system"/>  <!-- anyURI -->
  <p:option name="encoding"/>  <!-- string -->
  <p:option name="escape-uri-attributes" select="'false'"/>  <!-- boolean -->
  <p:option name="include-content-type" select="'true'"/>  <!-- boolean -->
  <p:option name="indent" select="'false'"/>  <!-- boolean -->
  <p:option name="media-type"/>  <!-- string -->
  <p:option name="method" select="'xml'"/>  <!-- QName -->
  <p:option name="normalization-form" select="'none'"/>  <!-- NormalizationForm -->
  <p:option name="omit-xml-declaration" select="'true'"/>  <!-- boolean -->
  <p:option name="standalone" select="'omit'"/>  <!-- "true" | "false" | "omit" -->
  <p:option name="undeclare-prefixes"/>  <!-- boolean -->
  <p:option name="version" select="'1.0'"/>  <!-- string -->
</p:declare-step>
The pxp:compress step serializes the document that appears on its source port and compresses it.
If the input document is base64 encoded, it is decoded and the corresponding bytes are compressed.
The compression-method option can be used to identify the compression method used. Suggested values are “bzip2”, “compress”, “gzip”, etc. If unspecified, the default method is implementation defined.
Note
Would it be better to specify a default? Perhaps gzip?
It is a dynamic error if the method is unrecognized.
If the href option is present, the step attempts to store the compressed data to the IRI specified. In this case, it produces a c:result element on its result port that contains the IRI where the data was stored.
If the href option is not present, the step returns the compressed data in a base64 encoded c:data element with an appropriate content-type.
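For instance, storing a gzip-compressed serialization of a document might be sketched as follows. The names doc.xml and out/doc.xml.gz are hypothetical, extension-library.xpl is a placeholder for the library that declares the pxp steps in your processor, and whether “gzip” is accepted depends on the implementation, since the set of methods is only suggested above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <!-- Receives a c:result element naming where the data was stored -->
  <p:output port="result"/>

  <!-- Placeholder import (assumption): replace with the library that
       declares the pxp steps for your processor. -->
  <p:import href="extension-library.xpl"/>

  <!-- indent is one of the serialization options; it is applied
       before the serialized bytes are compressed. -->
  <pxp:compress compression-method="gzip"
                href="out/doc.xml.gz"
                indent="true">
    <p:input port="source">
      <p:document href="doc.xml"/>
    </p:input>
  </pxp:compress>
</p:declare-step>
```

Omitting href would instead place the compressed bytes on the result port as a base64 encoded c:data element.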
pxp:uncompress
A step for expanding compressed data.
<p:declare-step type="pxp:uncompress">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="compression-method"/>  <!-- string -->
</p:declare-step>
If the document that appears on the source port is base64 encoded, this step will decode and attempt to uncompress the data. As a convenience, if the data is not encoded, the XML document is simply passed through, like the p:identity step.
The compression-method option can be used to identify the compression method used. Suggested values are “bzip2”, “compress”, “gzip”, etc. If unspecified, implementations are free to attempt to deduce the method from the data.
It is a dynamic error if:
the compression method is unrecognized or
the resulting, decoded and expanded data is not a well-formed XML document.
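A round trip through pxp:compress and pxp:uncompress illustrates how the two steps are meant to fit together. The sketch below assumes an implementation that recognizes “gzip”, and extension-library.xpl is a placeholder for the library that declares the pxp steps in your processor; the inline document is purely illustrative.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result"/>

  <!-- Placeholder import (assumption): replace with the library that
       declares the pxp steps for your processor. -->
  <p:import href="extension-library.xpl"/>

  <!-- No href option: the compressed bytes come back as a
       base64 encoded c:data element. -->
  <pxp:compress compression-method="gzip">
    <p:input port="source">
      <p:inline><doc>hello</doc></p:inline>
    </p:input>
  </pxp:compress>

  <!-- Reads the c:data element from the preceding step, decodes and
       expands it, and parses the bytes as XML. -->
  <pxp:uncompress compression-method="gzip"/>
</p:declare-step>
```

Stating compression-method explicitly on pxp:uncompress avoids relying on the implementation to deduce the method from the data.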
pxp:set-base-uri
A step for changing the base URI of a document.
<p:declare-step type="pxp:set-base-uri">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="uri" required="true"/>  <!-- string -->
</p:declare-step>
The document that appears on the source port is copied to the result port. The base URI of the copied document will be the URI specified in the uri option. If the URI specified is relative, it will be made absolute with respect to the base URI of the option element.
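The effect can be observed by reading the base URI back in a following step, as in this sketch. The URI http://example.org/docs/new.xml and the inline document are illustrative only, and extension-library.xpl is a placeholder for the library that declares the pxp steps in your processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0">
  <p:output port="result"/>

  <!-- Placeholder import (assumption): replace with the library that
       declares the pxp steps for your processor. -->
  <p:import href="extension-library.xpl"/>

  <!-- Assign an absolute base URI to the inline document -->
  <pxp:set-base-uri uri="http://example.org/docs/new.xml">
    <p:input port="source">
      <p:inline><doc/></p:inline>
    </p:input>
  </pxp:set-base-uri>

  <!-- Record the document's base URI in an attribute so that it is
       visible in the serialized result. -->
  <p:add-attribute match="/doc" attribute-name="base">
    <p:with-option name="attribute-value" select="base-uri(/*)"/>
  </p:add-attribute>
</p:declare-step>
```

If the step behaves as described, the base attribute on the resulting doc element should carry the URI given in the uri option.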