XML Serialization and Deserialization

Overview
Namespaces and Prefixes
XML Serialization
Classes
Instances
Rules
XML Schema Built-in Datatypes
XML Deserialization
Processing of XML Schema Subelements
General Parsing Interface
Parsing of XML Schema Built-in Datatypes
Integration with Specific Parsers
Parse Tree Traversal
SAX

Overview

Classes (as an XML schema) and instances, including rules, may be serialized. (Those unfamiliar with XML should read the following in conjunction with XML Schema Part 0: Primer Second Edition.)

An interface is provided that can be used in conjunction with a parser to perform the reverse operation.

All symbols referred to below are external symbols of the package REASONER-EXT (nickname RS-EXT), unless otherwise indicated.

Namespaces and Prefixes

XML names that belong to a particular namespace are stored in a Lisp package associated with a namespace name (a symbol). This name, converted to a lowercase string, is used to qualify the names of the namespace in serialized output.

Associates namespace-name with uri and a set of Lisp-style, symbol or string, names.

Makes a new namespace, associating it with uri. The package argument is used to associate the namespace with an existing Lisp package. Signals an error if a package is already associated with the namespace.

ensure-namespace-package returns the package associated with a namespace name, creating one if none exists.

Used by ensure-namespace-package when creating a Lisp package associated with a namespace name. Initially :xml.

Retrieves or sets the package associated with a namespace name.

Retrieves or sets the uri (a string) associated with a namespace name.

XML Serialization

Names are formatted in the standard XML mixed-case style.

If non-nil (the default), complexType names have their initial character capitalized.
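As an illustration of the naming convention, the following minimal sketch converts a hyphenated Lisp name to XML mixed case using only standard Common Lisp. The function lisp-name-to-xml-name and its :capitalize keyword are hypothetical and are not part of the library; the keyword merely mimics the effect described above for complexType names.

```lisp
;; Hypothetical helper, not part of the library's API.
(defun lisp-name-to-xml-name (name &key capitalize)
  "Convert NAME (a string designator such as COMPLEX-TYPE) to mixed case.
If CAPITALIZE is non-nil, the initial character is capitalized as well."
  (with-output-to-string (out)
    (let ((upcase-next capitalize))
      (loop for char across (string-downcase name)
            do (cond ((char= char #\-)
                      ;; Drop the hyphen and capitalize what follows it.
                      (setf upcase-next t))
                     (upcase-next
                      (write-char (char-upcase char) out)
                      (setf upcase-next nil))
                     (t
                      (write-char char out)))))))
```

For example, (lisp-name-to-xml-name "TARGET-NAMESPACE") yields "targetNamespace", and adding :capitalize t yields "TargetNamespace".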

Serialized output is sent to *standard-output*.

Redirects *standard-output* to stream.

Determines whether elements are printed on separate lines and indented; initially non-nil.

Determines how far inner elements are indented; initially 2.

Determines whether, within a schema or ruleset, a blank line is inserted between definitions; initially non-nil.

Classes

Incorporates the output of the forms in a schema element. documentation is a documentation string, used to annotate the schema; documentation-lang defaults to :en.

target-namespace is a prefix used to determine the value of the targetNamespace attribute of the schema element, and to create a temporary binding of *target-namespace*. Defaults to :xsd, denoting the XML Schema namespace.

namespaces is used to create a temporary binding of *namespaces*. Defaults to (list :xsd).

default-namespace is used to create a temporary binding of *default-namespace*.

The values of the schema attributes elementFormDefault and attributeFormDefault are governed by *qualified-local-elements* and *qualified-local-attributes*.

Outputs a top-level element declaration. If type is omitted, it will be the same as name.

Serializes the portion of the class hierarchy of which root is the root, as a sequence of complexType (simpleType if it is a range-class) elements. format defaults to :xml.

If format is :xml, signals an error if any of the classes have multiple superclasses, unless tangled is non-nil. When tangled is non-nil, the rightmost superclass is used as the base type and the derived type contains all the elements (i.e., slot definitions) inherited from the other superclasses.

If derived is non-nil, a derived type will be output for the root class.

Outputs a derived (unless derived is nil) complexType definition.

If a class both restricts and extends the slot definitions of a superclass, two derived type definitions will be output, with the extended type being the subordinate of the restricted type.

Used to construct the restricted type name from the class name; initially :restricted.

Outputs simpleType definitions derived from integer if a subclass of numeric-range, or string otherwise.

If non-nil, simpleType definitions will be list types; if nil (the default), atomic types.

This setting also affects the serialization of instances. With atomic types, a range that has not been narrowed to a single value produces no element content, and the nil attribute of the element is set; with list types, the element content comprises a list of values.

Instances

Precedes the output of forms with an XML header. version defaults to 1.0.

target-namespace is a prefix indicating the target namespace declared in the schema, if any; used to create a temporary binding of *target-namespace*.

namespaces and default-namespace are used to create temporary bindings for *namespaces* and *default-namespace*, respectively.

Prefix that should be used to qualify elements and attributes (and type references in the schema), unless it is also the default namespace. (*qualified-local-elements* and *qualified-local-attributes* are used to exert further control over qualification.) Is added at the front of *namespaces*, if not already present.

A list of prefixes corresponding to the namespaces that should be declared at the beginning of an element. Names belonging to a namespace in this list will be qualified, unless it is also the default namespace. Initially set to (list :xsi), where :xsi denotes the XML Schema instance namespace.

When determining whether to qualify a name, this list is first searched for a namespace which has an associated Lisp package that is the same as the name’s home package. Upon failure, it is searched again, for the first package in which there is a symbol with the same symbol-name as the name.
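The two-pass search can be sketched in standard Common Lisp as follows. The packages DEMO-NS-A and DEMO-NS-B, the table *demo-namespaces* and the function find-prefix are hypothetical stand-ins for the library's own namespace machinery, which is not shown here.

```lisp
;; Hypothetical stand-ins for two namespace packages.
(defpackage :demo-ns-a (:use))
(defpackage :demo-ns-b (:use))

;; Hypothetical table mapping a prefix to its associated package,
;; in the order in which the namespaces would be searched.
(defparameter *demo-namespaces*
  (list (cons :a (find-package :demo-ns-a))
        (cons :b (find-package :demo-ns-b))))

(defun find-prefix (symbol &optional (namespaces *demo-namespaces*))
  "Return the prefix that would qualify SYMBOL, or NIL if none applies."
  (let ((home (symbol-package symbol))
        (name (symbol-name symbol)))
    (or
     ;; First pass: a namespace whose package is the symbol's home package.
     (car (find home namespaces :key #'cdr))
     ;; Second pass: the first package holding a symbol of the same name.
     (car (find-if (lambda (entry)
                     (nth-value 1 (find-symbol name (cdr entry))))
                   namespaces)))))
```

A symbol interned in DEMO-NS-B is matched on the first pass; a symbol from elsewhere that merely shares its name with one in DEMO-NS-B is matched on the second.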

Prefix indicating the namespace, if any, that should be declared as the default at the beginning of an element.

*qualified-local-elements* and *qualified-local-attributes* determine whether locally defined element and attribute names belonging to the target namespace are qualified. They correspond to the schema attributes elementFormDefault and attributeFormDefault. Both are initially nil.

The name of the uniquely-identifying (unqualified) attribute that will be associated with elements corresponding to named instances; initially :name.

serialize-object recursively serializes object and all objects to which it refers. It does not detect circularities (but see serialize-slot).

tag and type are the name and type, respectively, of the corresponding element (root element if global is non-nil).

global indicates whether the element corresponding to object is defined globally in the schema. Defaults to non-nil.

If global is non-nil, namespace declarations are added according to the values of *namespaces* and *default-namespace*, and the element itself will be qualified with *target-namespace*, if non-nil and not the default namespace.

global is bound to nil for recursively serialized objects. The qualification of local elements and attributes is governed by *qualified-local-elements* and *qualified-local-attributes*.

If a slot refers to fewer instances than indicated by the lower bound of the :count slot option in the instance’s class definition, then the missing instances will be created beforehand. If there is no such slot option, a value of 1 is used.

Designates a function of one argument, a class name, that is used to generate unique names for instances.

serialize-slot serializes the contents of a slot. slot is a slot definition metaobject; the other arguments are the same as for serialize-object.

A method can be supplied in order to inhibit the serialization of a particular slot.

format-as-type is called to format an element of a range. datatype is either a datatype associated with the range, or the name of the range itself (see defrange). The default method returns an XML-style name, if appropriate; if not, the object itself.

print-as-type is called to print an element of a range. The default method calls princ on the value returned by format-as-type.

Examines *use-list-simple-types* and returns the value to be passed to print-as-type. Should be overridden if a print-as-type method always requires access to all elements of a range.

Rules

An XML schema for the rule language is in reasoner.xsd.

Serializes rules (a list of instances or their names), enclosing them in a ruleSet element, incorporating namespace declarations according to *namespaces* and *default-namespace*. format defaults to :xml.

Serializes a rule (well-formed-formula) instance.

XML Schema Built-in Datatypes

print-as-type methods are defined for the datatypes mentioned in the section Parsing of XML Schema Built-in Datatypes.

Used to decode a range bound representing a time. Initially decode-universal-time.

XML Deserialization

The deserialization operation utilizes an external XML parser.

Parsers can be categorized according to whether they produce a complete parse tree, or expose an interface (e.g., SAX) that enables parsing to be interleaved with subsequent processing.

deserialize-as-objects takes a complete parse tree and creates or reinitializes the equivalent classes or instances. It calls deserialize-as-object and returns a set of assumptions.

namespaces is used to create a temporary binding for *namespaces*, which is used to find the Lisp package associated with a namespace; it defaults to all defined namespaces (see make-namespace). namespace, used to create a temporary binding for *default-namespace*, is added to the front of this list if non-nil. base is used to create a temporary binding for *target-namespace* and defaults to the value of namespace (see RDF and OWL Compatibility).

deserialize-as-object takes a top-level element and creates or reinitializes the equivalent class or instance.

tag-fn, when applied to a node of the tree, should return a tag name, unqualified; attribute-fn should return an attribute value, given a node, a name and, if the name is qualified, the uri of the namespace; content-fn should return a list comprising either a single string, or subordinate nodes.

tag-fn defaults to caar (or car, if atomic); attribute-fn defaults to assoc applied to the cdar of the node; content-fn defaults to cdr.
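The defaults suggest a list-based parse tree. The sketch below spells them out for one plausible node shape, ((tag . attribute-alist) . content), or (tag . content) when there are no attributes. That shape, the use of keyword names, and the names default-tag, default-attribute, default-content and *demo-node* are all assumptions made for illustration; the actual shape depends on the parser in use.

```lisp
;; Hypothetical equivalents of the default accessors.
(defun default-tag (node)
  "CAAR of NODE, or CAR if the head of NODE is atomic."
  (let ((head (car node)))
    (if (atom head) head (car head))))

(defun default-attribute (node name)
  "Look NAME up with ASSOC in the CDAR of NODE and return its value.
Taking the CDR of the matching pair is an assumption of this sketch."
  (let ((head (car node)))
    (unless (atom head)
      (cdr (assoc name (cdr head))))))

(defun default-content (node)
  "CDR of NODE: a list of one string, or of subordinate nodes."
  (cdr node))

;; A person element with a name attribute and one age subelement.
(defparameter *demo-node*
  '((:person (:name . "fred")) (:age "42")))
```

Here (default-tag *demo-node*) is :person, (default-attribute *demo-node* :name) is "fred", and (default-content *demo-node*) is a list holding the single age node, whose own content is the list ("42").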

Uses a low-level interface that is also compatible with the second category of parsers.

Captures assumptions created during deserialization. Rebound by deserialize-as-objects.

If assigned a single assumption, will be used in place of any number of assumptions that would otherwise be created.

Defines a set of XML-style, symbol or string, element names, belonging to namespace-name, whose presence should not affect the processing of surrounding elements in an XML document.

The full list of elements that are to be treated as no-ops.

If bound to nil (the default is non-nil), eliminates the considerable overhead of maintaining a count of subelements when there are many. See Cardinality.

The system-supplied method examines the above variable. May be specialized to exert finer-grained control.

Processing of XML Schema Subelements

A complexType definition is always treated as corresponding to a CLOS class definition.

Global element declarations and references to them are recognized.

The built-in datatypes boolean, integer, nonNegativeInteger, positiveInteger and string, as well as those mentioned in the section Parsing of XML Schema Built-in Datatypes, may be used in element declarations.

The all and choice group elements are recognized, but no processing is performed beyond that for sequence.

Named group elements are not recognized.

General Parsing Interface

Usually, an XML name is converted into a Lisp equivalent, with hyphens inserted at the points marked by a transition between lower and upper case. However, if readtable-case, when called with the current readtable, returns :preserve, the XML variant will be used exclusively.
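The usual conversion can be sketched in standard Common Lisp as follows. The function xml-name-to-lisp-name is hypothetical, and the upcasing of the result assumes a readtable with the standard :upcase case; the sketch does not handle the :preserve case described above.

```lisp
;; Hypothetical helper, not part of the library's API.
(defun xml-name-to-lisp-name (name)
  "Convert an XML mixed-case NAME (a string) to an upper-case string with
a hyphen at every transition from a lower-case to an upper-case letter."
  (with-output-to-string (out)
    (loop for previous = nil then char
          for char across name
          do (when (and previous
                        (lower-case-p previous)
                        (upper-case-p char))
               (write-char #\- out))
             (write-char (char-upcase char) out))))
```

For example, (xml-name-to-lisp-name "complexType") yields "COMPLEX-TYPE", while a name with no lower-to-upper transition, such as "name", is simply upcased.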

Locates the symbol named name (a string) in the package associated with the namespace given by namespace, which can be either a uri (string) or a keyword symbol. If none is found, signals an error, unless errorp is nil. For use with element names. See defnamespace, define-ignored-elements.

Enters a symbol named name (a string) into the package associated with the namespace given by namespace, which can be a uri (string), a keyword symbol, or nil. If namespace is nil and the name is not already present in one of the members of *namespaces*, it is interned in the current package.

Called to parse non-whitespace content within an instance document, bounded by min and max. The default method calls parse-integer, and then, if nil is returned, intern-xml-name. See format-as-type.

Should be called when an element is first encountered.

Records the content of a leaf element.

Should be called when unwinding out of an element.

Used to derive from a schema element’s name, if it contains occurrence constraints, the name and type of an additional slot definition that is created to hold this information (see Cardinality). Initially :count and :numbers.

Parsing of XML Schema Built-in Datatypes

parse-as-type methods are defined for the datatypes enumerated below. All return a numeric range; in cases other than dateTime and time the lower and upper bounds will differ (may differ in the case of duration).

dateTime and its truncated variants (time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth) are represented, viewed as sets of times, by their earliest and latest elements, and are converted by *date-time-encoding-fn*. Fractional seconds are ignored.
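To illustrate the earliest/latest representation, the sketch below computes the bounds for a gYear and a date with encode-universal-time. The functions year-range and date-range are hypothetical, and two assumptions are made: time zone 0 (UTC) is used throughout, and the latest element is taken to be the final whole second of the period.

```lisp
;; Hypothetical helpers, not part of the library's API.
(defun year-range (year)
  "Return the earliest and latest universal times (UTC) within YEAR."
  (values (encode-universal-time 0 0 0 1 1 year 0)
          (encode-universal-time 59 59 23 31 12 year 0)))

(defun date-range (year month day)
  "Return the earliest and latest universal times (UTC) within the date."
  (values (encode-universal-time 0 0 0 day month year 0)
          (encode-universal-time 59 59 23 day month year 0)))
```

Under these assumptions the two bounds of a date differ by 86399 seconds, and those of a non-leap year by one second less than 365 days.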

A duration is converted to a number of seconds. A fraction in the seconds component is ignored. Ambiguity arises from the varying lengths of months and years.
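The ambiguity can be made concrete with a crude sketch that brackets a duration between two numbers of seconds. The function duration-seconds-range and its keyword arguments are hypothetical, and the bounds used here (28 to 31 days per month, 365 to 366 days per year) are a conservative assumption for illustration, not necessarily the values the library computes.

```lisp
;; Hypothetical helper, not part of the library's API.
(defun duration-seconds-range (&key (years 0) (months 0) (days 0)
                                    (hours 0) (minutes 0) (seconds 0))
  "Return two values: assumed lower and upper bounds, in seconds."
  (let ((fixed (+ (* days 86400)
                  (* hours 3600)
                  (* minutes 60)
                  (floor seconds))))   ; a fraction of a second is dropped
    (values
     ;; Shortest case: 365-day years and 28-day months.
     (+ fixed (* 86400 (+ (* years 365) (* months 28))))
     ;; Longest case: 366-day years and 31-day months.
     (+ fixed (* 86400 (+ (* years 366) (* months 31)))))))
```

A duration with no year or month component gives identical bounds, whereas one month gives bounds of 2419200 and 2678400 seconds.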

*date-time-encoding-fn* is initially encode-universal-time, which will reject a negative-signed year. A substitute should accept the same arguments as encode-universal-time.

Integration with Specific Parsers

Parse Tree Traversal

Takes as its input an element in the parse tree generated by the CLLIB XML Parser. See deserialize-as-objects.

SAX

Both the CXML and Allegro parsers are supported. If both are present, CXML takes precedence.

A class deserializer is defined (in package rs-sax), which can be used in conjunction with the CXML function parse.

For Allegro users, deserializer is defined in package net.xml.rs, and can be used with the functions sax-parse-file, sax-parse-stream and sax-parse-string.

Called if there is no namespace definition corresponding to (having the same uri as) a declaration in the document. The default behaviour is to call make-namespace.

Called if there is no namespace definition having the same uri as the default namespace declaration in the document. The default behaviour is to call make-namespace, passing it a fresh symbol.