Sablotron(3) User Contributed Perl Documentation Sablotron(3)
NAME
XML::Sablotron - a Perl interface to the Sablotron XSLT processor
SYNOPSIS
use XML::Sablotron qw (:all);
Process(.....);
If you prefer an object approach, you can use the object wrapper:
$sab = new XML::Sablotron();
$sab->runProcessor($template_url, $data_url, $output_url,
\@params, \@arguments);
$result = $sab->getResultArg($output_url);
Note that the Process function as well as the SablotProcess function
are deprecated. See the "USAGE" section for more details.
DESCRIPTION
This package is an interface to the Sablotron API.
Sablotron is an XSLT processor implemented in C++ based on the Expat
XML parser.
To run this package, you need to download and install Sablotron from
the http://www.gingerall.cz/charlie-bin/get/webGA/act/download.act
page. Sablotron itself requires the Expat XML parser
(http://expat.sourceforge.net).
See the Sablotron documentation for more details.
You do _not_ need to download any other Perl packages to run the
XML::Sablotron package.
Since version 0.60 Sablotron supports DOM Level2 methods to access
parsed trees, modify them and process them, as well as serialize them
into files etc. The DOM trees are not dependent on the processor
object, so you may use them for data or stylesheet caching.
USAGE
There are two ways to use Sablotron. The first (and simplest) is based
on procedural calls; the second is based on an object-oriented
interface.
Note that the original procedural interface is deprecated and should
not be used.
Procedural Model
There are two functions exported from the XML::Sablotron package:
ProcessStrings and Process. As mentioned above, these functions are
deprecated and shouldn't be used. Many Sablotron features, such as
miscellaneous handlers and the DOM model, are not available through
this interface. See the "EXPORTED FUNCTIONS" section for the usage of
these functions.
Object Interface
There are two classes defined to deal with the Sablotron processor
object.
"XML::Sablotron::Processor" is a class implementing an interface to the
Sablotron processor object. Multiple concurrent processors are
supported, so you may use Sablotron in multithreaded programs easily.
The implementation of this class contains a circular reference inside
Perl structures, which has to be broken by calling the "_release"
method. Unless you plan to hack on this package, you don't need to use
this mechanism directly.
"XML::Sablotron" is often the only thing you need. It's a wrapper
around the XML::Sablotron::Processor object. The sole purpose of this
class is to keep track of the life cycle of the processor, so you don't
have to deal with reference counting inside the processor class. All
calls to this class are redirected to an inner instance of the
XML::Sablotron::Processor object.
In addition to the interface of previous versions of XML::Sablotron,
there are new interface methods. We strongly recommend that you use
these new methods.
Previous versions used the RunProcessor method, which was called with
many parameters specifying XSL parameters, processed buffers and URLs.
The new interface methods are more intuitive to use and, most
importantly, they allow you to process preparsed DOM documents as well
as newly parsed ones.
New methods are:
* addArg
* addArgTree
* addParam
* process
See the method descriptions later in this document for details.
API NAME CHANGES
Since release 0.60, the whole API uses a uniform naming convention:
names start with a lower-case letter, and the first letter of each
following word is capitalized. Users of older versions need not panic;
the old names are kept for compatibility.
SITUATION
Since release 0.60 there is a new object (used internally in previous
versions) that serves several tasks. In this Perl module it is
represented by the XML::Sablotron::Situation package.
At this time the situation is used only for error tracking, but in
further releases its usage will become quite extensive. (It will be
used for all handlers etc.)
So far you do not have to use the Situation object to process data (and
in many cases it is not even possible). There is one exception: if you
use the DOM interface (XML::Sablotron::DOM module), you have to create
and use the situation object like this:
$situa = new XML::Sablotron::Situation;
EXPORTED FUNCTIONS
ProcessStrings - deprecated
"ProcessStrings($template, $data, $result);"
where...
$template
contains an XSL stylesheet
$data
contains the XML data to be processed
$result
is filled with the desired output
This function returns the Sablotron error code.
Process - deprecated
This function provides a more general interface to Sablotron. You may
find its usage a little bit tricky, but it offers a variety of ways to
modify the Sablotron behavior.
Process($template_uri, $data_uri, $result_uri,
$params, $buffers, $result);
where...
$template_uri
is a URI of XSL stylesheet
$data_uri
is a URI of processed data
$result_uri
is a URI of the destination buffer. Currently, only the arg: scheme is
supported. Use the value arg:/result (the name of the $result
variable without the "$" sign).
$params
is a reference to array of global stylesheet parameters
$buffers
is a reference to array of named buffers
$result
receives the result. It requires $result_uri to be set to
arg:/result.
The following example should make it clear.
Process("arg:/template", "arg:/data", "arg:/result",
undef,
["template", $template, "data", $data],
$result);
does exactly the same as
ProcessStrings($template, $data, $result);
Why is it so complicated? Please, see the Sablotron documentation for
details.
This function returns the Sablotron error code.
RegMessageHandler - canceled
This function is deprecated and no longer supported. See the descrip-
tion of object interface later in this document.
UnregMessageHandler - canceled
This function is deprecated and no longer supported. See the descrip-
tion of object interface later in this document.
XML::Sablotron
new
The constructor of the XML::Sablotron object takes no arguments, so you
can create new instance simply like this:
$sab = new XML::Sablotron();
addArg
Adds an argument to the processor. Almost nothing happens at the time
of the call, but the argument may be processed later by the "process"
method.
$sab->addArg($situa, $name, $data);
$situa
The situation to be used.
$name
The name of the buffer in the "arg:" scheme.
$data
The literal XML data to be parsed and remembered.
addArgTree
Add a DOM document to the processor. This document may be processed
later with the "process" call.
$sab->addArgTree($situa, $name, $doc);
$situa
The situation to be used.
$name
The name of the buffer in the "arg:" scheme.
$doc
The DOM document. Must be a XML::Sablotron::DOM::Document instance.
addParam
Adds an XSL parameter to the processor. The parameter may be accessed
later by the "process" call.
$sab->addParam($situa, $name, $value);
$situa
The situation to be used.
$name
The name of the parameter.
$value
The value of the parameter.
process
This function starts the XSLT processing over the previously specified
data. Data are added to the processor using the "addArg", "addArgTree"
and "addParam" methods.
$sab->process($situa, $template_uri, $data_uri, $result_uri);
$situa
The situation to be used.
$template_uri
The URI of XSL stylesheet
$data_uri
The URI of processed data
$result_uri
The URI of the destination buffer
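The calls above fit together roughly as follows. This is a minimal sketch, assuming XML::Sablotron is installed; the helper name transform_string and the buffer names template, data and result are illustrative and not part of the module. The return value of "process" is not described here, so the sketch does not inspect it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Sablotron): apply a stylesheet held
# in a string to XML held in a string, using only addArg, addParam, process
# and getResultArg. Returns nothing when XML::Sablotron cannot be loaded.
sub transform_string {
    my ($xsl, $xml, %params) = @_;

    # Load the module at run time so the sketch degrades gracefully.
    return unless eval { require XML::Sablotron; 1 };

    my $situa = XML::Sablotron::Situation->new();
    my $sab   = XML::Sablotron->new();

    # Register the named buffers in the arg: scheme.
    $sab->addArg($situa, 'template', $xsl);
    $sab->addArg($situa, 'data',     $xml);

    # Register global stylesheet parameters, if any.
    $sab->addParam($situa, $_, $params{$_}) for sort keys %params;

    # Run the transformation and fetch the result buffer.
    $sab->process($situa, 'arg:/template', 'arg:/data', 'arg:/result');
    return $sab->getResultArg('arg:/result');
}
```

A call such as transform_string($xsl, $xml, title => 'Report') would then return the transformed text, with title passed as a stylesheet parameter.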
runProcessor
The runProcessor method is the older method analogous to the Process
function. You may find it useful, but the use of the "process" method
is recommended.
$code = $sab->runProcessor($template_uri, $data_uri, $result_uri,
$params, $buffers);
where...
$template_uri
is a URI of XSL stylesheet
$data_uri
is a URI of processed data
$result_uri
is a URI of destination buffer
$params
is a reference to array of global stylesheet parameters
$buffers
is a reference to array of named buffers
URIs passed to this function may be from schemes supported internally
(file:, arg:) or from any scheme handled by a registered handler (see
the "HANDLERS" section).
Note the difference between the runProcessor method and the Process
function: runProcessor doesn't return the output buffer (the $result
parameter is missing).
To obtain the result buffer(s) you have to call the "getResultArg"
method.
Example of use:
$sab->runProcessor("arg:/template", "arg:/data", "arg:/result",
undef,
["template", $template, "data", $data] );
getResultArg
Call this function to obtain the result buffer after processing. The
goal of this approach is to enable multiple output buffers.
$result = $sab->getResultArg($output_url);
This method returns the output buffer specified by its URL. Specifying
the "arg:" scheme in the URI is optional.
The preceding runProcessor example would continue:
$return = $sab->getResultArg("result");
freeResultArgs
$sab->freeResultArgs();
This call frees up all output buffers allocated by Sablotron. You do
not have to call this function as these buffers are managed by the pro-
cessor internally.
Use this function to release huge chunks of memory while an instance of
processor stays idle for a longer time.
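The three calls above (runProcessor, getResultArg, freeResultArgs) can be combined into one step. The following is only a sketch: the helper name run_once is hypothetical, and it takes an already constructed XML::Sablotron object as its first argument.

```perl
use strict;
use warnings;

# Hypothetical helper built on the older runProcessor interface.
# $sab is expected to be an XML::Sablotron instance.
# Returns the Sablotron error code and the content of the result buffer.
sub run_once {
    my ($sab, $xsl, $xml) = @_;

    # Named buffers are passed as a flat list of name/value pairs.
    my $code = $sab->runProcessor(
        'arg:/template', 'arg:/data', 'arg:/result',
        undef,                               # no stylesheet parameters
        [ template => $xsl, data => $xml ],
    );

    # The "arg:" prefix is optional when naming the result buffer.
    my $output = $sab->getResultArg('result');

    # Release the output buffers now rather than leaving them allocated
    # while the processor sits idle.
    $sab->freeResultArgs();

    return ($code, $output);
}
```

Calling freeResultArgs immediately is a choice made for long-lived processors; for a short script it can be left out.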
regHandler
Registers an external handler of a particular type. The processor can
use the handler for miscellaneous tasks such as log and error hooking.
For more details on handlers see the "HANDLERS" section of this
document.
There are two ways to call the regHandler method:
$sab->regHandler($type, $handler);
where...
$type
is the handler type (see "HANDLERS")
$handler
is an object implementing the handler interface
The second way allows you to create anonymous handlers defined as a set
of function calls:
$sab->regHandler($type, { handler_stub1 => \&my_proc1,
handler_stub2 => \&my_proc2.... });
However, this form is very simple and does not allow you to unregister
the handler later.
For a detailed description of the handler interface, see the "HANDLERS"
section.
unregHandler
$sab->unregHandler($type, $handler);
This method unregisters a registered handler.
Remember that anonymously registered handlers can't be unregistered.
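Because only handlers registered as objects can be unregistered, a handler that is needed temporarily is best written as a class. The sketch below assumes the messages handler methods described in the "HANDLERS" section; the class name My::MessageCollector and the helper with_message_collector are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical messages handler that collects events in memory.
package My::MessageCollector;

sub new { return bless { log => [], errors => [] }, shift }

# Keep Sablotron's internal code unchanged.
sub MHMakeCode {
    my ($self, $processor, $severity, $facility, $code) = @_;
    return $code;
}

sub MHLog {
    my ($self, $processor, $code, $level, @fields) = @_;
    push @{ $self->{log} },
        { code => $code, level => $level, fields => [@fields] };
    return;
}

sub MHError {
    my ($self, $processor, $code, $level, @fields) = @_;
    push @{ $self->{errors} },
        { code => $code, level => $level, fields => [@fields] };
    return;
}

package main;

# Register a collector, run $work->($sab), and always unregister again.
sub with_message_collector {
    my ($sab, $work) = @_;
    my $collector = My::MessageCollector->new();

    $sab->regHandler(0, $collector);        # 0 = messages handler
    my $ok  = eval { $work->($sab); 1 };
    my $err = $@;
    $sab->unregHandler(0, $collector);      # needs the same object

    die $err unless $ok;
    return $collector;
}
```

The same object is passed to regHandler and unregHandler, which is what the anonymous hash form cannot provide.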
set/getEncoding
$sab->setEncoding($encoding);
Calling these methods has no effect. They are valuable for a
miscellaneous handler, which may store received values together with
the processor instance.
set/getContentType
$sab->setContentType($content_type);
Calling these methods has no effect. They are valuable for a
miscellaneous handler, which may store received values together with
the processor instance.
setOutputEncoding
$sab->setOutputEncoding($encoding);
This method allows you to override the encoding specified in the
<xsl:output> instruction. It makes it possible to produce differently
encoded outputs using one template.
setBase
$sab->setBase($base_url);
Call this method to make the processor use $base_url as the base URI
when resolving any relative URI within the data or the template.
setBaseForScheme
$sab->setBaseForScheme($scheme, $base);
Like "setBase", but the given base URL is used only for the specified
scheme.
setLog
$sab->setLog($filename, $level);
This method sets the log file name and the log level. See "Messages
handler - overview" for details on log levels.
clearError
$sab->clearError();
This method clears the last internal error of the processor.
XML::Sablotron::Situation
Sablotron performs almost all operations in a special context used for
error tracing. This is useful for multithreaded programming or if you
need to call Sablotron in a reentrant way.
The price you pay for this is the need to specify this context in many
calls. Using DOM access to Sablotron structures requires it for almost
every call.
The "XML::Sablotron::Situation" object represents the execution
context.
E.g. if you want to create a new DOM document, you have to do the
following:
$situa = new XML::Sablotron::Situation();
$doc = new XML::Sablotron::DOM::Document(SITUATION => $situa);
The situation object supports several methods you may use if you want
to get more details on an error that has occurred.
(Note: In upcoming releases the Situation object will be used for more
tasks like handler registering etc.)
setOptions
$situa->setOptions($options);
Control some processing features. The $options parameter may be any
combination of following constants:
* SAB_NO_ERROR_REPORTING
suppress error reporting
* SAB_PARSE_PUBLIC_ENTITIES
forces parser to parse all external entities (even public ones)
* SAB_DISABLE_ADDING_META
suppress outputting of the meta tag (html method)
getDOMExceptionCode
Returns the last error code.
getDOMExceptionMessage
Returns the string characterizing the last occurred error.
getDOMExceptionDetails
Returns ARRAYREF with several details on the most recent error. See
example:
$arr = $situa->getDOMExceptionDetails();
($code, $message, $uri, $line) = @$arr;
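For logging, the four values can be folded into a single line. The helper below is a hypothetical illustration, not part of the module; it assumes only the array layout shown above (code, message, URI, line) and tolerates undefined entries.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the array reference returned by
# getDOMExceptionDetails into a one-line, human-readable message.
sub format_exception_details {
    my ($details) = @_;
    my ($code, $message, $uri, $line) = @{ $details || [] };

    # This sketch substitutes placeholders for undefined entries.
    $code    = '?'             unless defined $code;
    $message = 'unknown error' unless defined $message;

    my $where = (defined $uri && length $uri) ? $uri : '(no URI)';
    $where .= ":$line" if defined $line && length $line;

    return "Sablotron error $code at $where: $message";
}
```

It would typically be called as format_exception_details($situa->getDOMExceptionDetails()).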
HANDLERS
Currently, Sablotron supports four types of handlers.
* messages handler (0)
* scheme handler (1)
* SAX-like output handler (2)
* miscellaneous handler (3)
General interface format
Callback functions implementing handlers have different prototypes (not
prototypes in the Perl sense), but the first two parameters are always
the same:
$self
is a reference to the registered object, so you can implement handlers
the common object way. If you register a handler with a hash reference
(see "regHandler"), this parameter refers to a hidden object, which is
useless for you.
$processor
is a reference to the processor that is calling your handler. It
allows you to use one handler for more than one processor.
Messages handler - overview
The goal of this handler is to deal with all messages produced by a
processor.
Each state reported by the processor is composed of the following data:
* severity
zero means: not so bad thing; 1 means: OOPS, bad thing
* facility
Helps to determine who is reporting in larger systems. Sablotron
always sets this value to 2.
* code
An internal Sablotron code.
Each reported event falls into one of predefined categories, which
define the event level. The valid levels include:
* debug (0)
all stuff
* info (1)
information for curious people
* warn (2)
warnings on suspicious things
* error (3)
huh, something is wrong
* critical (4)
very, very bad day...
The numbers in the parentheses are the internal level codes.
Messages handler - interface
To define a messages handler, you have to define the following
functions (or methods, depending on the kind of registration, see
"regHandler").
MHMakeCode($self, $processor, $severity, $facility, $code)
This function is called whenever Sablotron needs to display a
message. It lets you convert the internal codes into your own
space of numbers. After this call Sablotron forgets its code and
uses yours.
To understand the parameters of this call see "Messages handler -
overview".
MHLog($self, $processor, $code, $level, @fields)
A Sablotron request to log some event.
$code
is the code previously returned by MHMakeCode
$level
is the event level (see "Messages handler - overview")
@fields
are text fields in format of "fldname: following text"
MHError($self, $processor, $code, $level, @fields)
is very similar to the MHLog function but it is called only when a
bad thing happens (error and critical levels).
Messages handler - example
A very simple message handler could look like this:
sub myMHMakeCode {
my ($self, $processor, $severity, $facility, $code) = @_;
return $code; # I can deal with internal numbers
}
sub myMHLog {
my ($self, $processor, $code, $level, @fields) = @_;
# LOGHANDLE must be a filehandle already opened for writing
print LOGHANDLE "[Sablot: $code]\n" . (join "\n", @fields, "");
}
sub myMHError {
myMHLog(@_); # log the event first, then give up
die "Dying from Sablotron errors, see log\n";
}
$sab = new XML::Sablotron();
$sab->regHandler(0, { MHMakeCode => \&myMHMakeCode,
MHLog => \&myMHLog,
MHError => \&myMHError });
That's all, folks.
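A slightly richer log line can be built from the level codes and the "fldname: following text" field format described in the overview. This is a sketch only; level_name, parse_fields, format_event and verboseMHLog are hypothetical names, and which field names Sablotron actually emits is not assumed.

```perl
use strict;
use warnings;

# Level names in the order of their internal codes (0..4), as listed in
# "Messages handler - overview".
use constant LEVEL_NAMES => qw(debug info warn error critical);

# Map a numeric level to its name.
sub level_name {
    my ($level) = @_;
    return 'unknown' unless defined $level && $level =~ /\A[0-4]\z/;
    return (LEVEL_NAMES)[$level];
}

# Split "fldname: following text" fields into a hash reference.
sub parse_fields {
    my (@fields) = @_;
    my %parsed;
    for my $field (@fields) {
        next unless defined $field && length $field;
        my ($name, $text) = split /:\s*/, $field, 2;
        $parsed{$name} = defined $text ? $text : '';
    }
    return \%parsed;
}

# Build one log line from an event.
sub format_event {
    my ($code, $level, @fields) = @_;
    my $parsed  = parse_fields(@fields);
    my $details = join '; ', map { "$_=$parsed->{$_}" } sort keys %$parsed;
    return sprintf '[Sablot %s/%s] %s', level_name($level), $code, $details;
}

# Drop-in replacement for myMHLog that writes to STDERR.
sub verboseMHLog {
    my ($self, $processor, $code, $level, @fields) = @_;
    print STDERR format_event($code, $level, @fields), "\n";
    return;
}
```

Keeping the formatting in format_event, separate from the handler itself, makes it possible to check the output without a running processor.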
Scheme handler - overview
One of the great features of Sablotron is its support for scheme
handlers. This feature allows you to reference data from any URL
scheme. Every time the processor is asked for some URI (e.g. using the
document() function), it looks for a handler that can resolve the
required document.
Sablotron asks the handler for all the document at once. If the handler
refuses this request, Sablotron "opens" a connection to the handler and
tries to read the data "per partes".
A handler can be used for the output buffers as well, so this mechanism
also supports the "put" method.
Scheme handler - interface
SHGetAll($self, $processor, $scheme, $rest)
This function is called when the processor is trying to resolve a
document. The processor expects the SHGetAll function to return the
whole document.
If you're going to use the second way (giving chunks of the
document), simply don't implement this function or return the "undef"
value from it.
$scheme parameter holds the scheme extracted from a URI
$rest holds the rest of the URI
SHOpen($self, $processor, $scheme, $rest)
This function is called before the first SHGet or SHPut call.
Use it to pass some "handle" (i.e. user data) to the processor.
This data will be a part of each following request (SHGet, SHPut).
SHGet($self, $processor, $handle, $size)
This function returns the following chunk of data. The size of the
data MUST NOT be greater than the $size parameter.
$handle is the value previously returned from the SHOpen function.
Return the "undef" value to say "No more data".
SHPut($self, $processor, $handle, $data)
This function stores a chunk of data given in the $data parameter.
SHClose($self, $processor, $handle)
You can close your internal connections, files, etc. using this
function.
Scheme handler - example
See the test script (test.pl) included in this distribution.
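As a rough illustration of the read side of the interface, the sketch below serves documents from an in-memory hash in chunks. The class name My::MemoryScheme is hypothetical. The exact form of $rest (for example, whether it begins with a slash) and the value SHOpen should return for an unknown document are assumptions; the sketch strips leading slashes and returns undef.

```perl
use strict;
use warnings;

# Hypothetical scheme handler serving documents from an in-memory hash.
package My::MemoryScheme;

sub new {
    my ($class, %docs) = @_;
    return bless { docs => {%docs} }, $class;
}

# Refuse the "all at once" request so that the chunked calls are used.
sub SHGetAll { return undef }

# Return a handle: a hash holding the data not yet handed out.
sub SHOpen {
    my ($self, $processor, $scheme, $rest) = @_;
    (my $key = $rest) =~ s{\A/+}{};     # assumption: $rest may start with "/"
    return undef unless exists $self->{docs}{$key};
    return { left => $self->{docs}{$key} };
}

# Hand out at most $size characters per call; undef means "no more data".
sub SHGet {
    my ($self, $processor, $handle, $size) = @_;
    return undef unless length $handle->{left};
    return substr($handle->{left}, 0, $size, '');   # cut the chunk off
}

# Drop whatever is still held for this handle.
sub SHClose {
    my ($self, $processor, $handle) = @_;
    %$handle = ();
    return;
}

package main;
```

Such an object would be registered with $sab->regHandler(1, My::MemoryScheme->new('doc.xml' => $xml)), 1 being the scheme handler type.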
SAX handler - overview
Sablotron supports both physical (file, buffer) and event-based output
methods. "SAX handler" is a slightly confusing name, because the events
produced by the engine are of a slightly different flavor than 'real'
SAX events; think of this feature as a SAX-like handler.
You may set this handler if you want to catch output events and process
them as you wish. Note that there is an XML::SAXDriver::Sablotron module
available, so you don't need to deal with the SAX-like handler if you
want to use Sablotron as a standard SAX driver.
SAX handler - interface
SAXStartDocument($self, $proc)
Event called at the very beginning of the output.
SAXStartNamespace($self, $proc, $prefix, $uri)
Event called when a new namespace declaration occurs.
SAXEndNamespace($self, $proc, $prefix)
Event called when a namespace declaration runs out of the scope.
Note that introducing and canceling namespaces does not have to be
properly nested.
SAXStartElement($self, $proc, $name, %atts)
Event called when an element is started. Name and attribute values
are provided.
SAXEndElement($self, $proc, $name)
Event called when an element is closed. Called before namespaces
run out of the scope.
SAXCharacters($self, $proc, $data)
Event called when data are output.
SAXComment($self, $proc, $data)
Event called when a comment occurs.
SAXPI($self, $proc, $target, $data)
Event called when processing instruction occurs.
SAXEndDocument($self, $proc)
Event called at the very end of the document.
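The callbacks above can be implemented by a small class that simply records what it receives. This is a sketch with a hypothetical class name (My::EventRecorder); it follows the signatures listed above and makes no assumption about how the processor is told to emit events.

```perl
use strict;
use warnings;

# Hypothetical SAX-like handler that records every event it receives.
package My::EventRecorder;

sub new { return bless { events => [] }, shift }

# Store one event as [ name, arguments... ].
sub _record {
    my ($self, @event) = @_;
    push @{ $self->{events} }, [@event];
    return;
}

sub SAXStartDocument  { my ($s, $p) = @_;                $s->_record('start_document') }
sub SAXEndDocument    { my ($s, $p) = @_;                $s->_record('end_document') }
sub SAXStartNamespace { my ($s, $p, $prefix, $uri) = @_; $s->_record('start_ns', $prefix, $uri) }
sub SAXEndNamespace   { my ($s, $p, $prefix) = @_;       $s->_record('end_ns', $prefix) }
sub SAXEndElement     { my ($s, $p, $name) = @_;         $s->_record('end_element', $name) }
sub SAXCharacters     { my ($s, $p, $data) = @_;         $s->_record('characters', $data) }
sub SAXComment        { my ($s, $p, $data) = @_;         $s->_record('comment', $data) }
sub SAXPI             { my ($s, $p, $target, $data) = @_; $s->_record('pi', $target, $data) }

# Attributes arrive as a flat name/value list after the element name.
sub SAXStartElement {
    my ($s, $p, $name, %atts) = @_;
    $s->_record('start_element', $name, {%atts});
}

# Concatenate all character data seen so far.
sub text {
    my ($self) = @_;
    return join '', map { $_->[1] }
                    grep { $_->[0] eq 'characters' } @{ $self->{events} };
}

package main;
```

An instance would be registered with $sab->regHandler(2, $recorder), 2 being the SAX-like output handler type.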
Miscellaneous handler - overview
This handler was introduced in version 0.42 and could be subject to
change in the near future. To avoid a namespace collision with the
message handler, the miscellaneous handler uses the prefix 'XS' (as in
extended features).
Miscellaneous handler - interface
XHDocumentInfo($self, $processor, $contentType, $encoding)
This function is called when document attributes are specified via
the <xsl:output> instruction. $contentType holds the value of the
"media-type" attribute, $encoding holds the value of the "encoding"
attribute.
Return value of this callback is discarded.
Miscellaneous handler - example
Suppose a template like this:
<?xml version='1.0'?>
...
<xsl:output media-type="text/html" encoding="iso-8859-2"/>
...
In this case XSDocumentInfo callback function is called with values of
"text/html" and "iso-8859-2".
LICENSE
This package is subject to the MPL (or the GPL alternatively).
The same licensing applies for Sablotron.
AUTHOR
Pavel Hlavnicka; pavel@gingerall.cz
SEE ALSO
perl(1).
perl v5.8.8 2004-06-04 Sablotron(3)