<jjc@jclark.com>, MURATA Makoto <EB2M-MRT@asahi-net.or.jp>
Copyright © The Organization for the Advancement of Structured Information Standards [OASIS] 2001. All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
This Committee Specification was approved for publication by the OASIS RELAX NG technical committee. It is a stable document which represents the consensus of the committee. Comments on this document may be sent to relax-ng-comment@lists.oasis-open.org.
A list of known errors has been applied to this document: http://www.oasis-open.org/committees/relax-ng/spec-20011203-errata.html
This document specifies
An XML document that is being validated with respect to a RELAX NG schema is referred to as an instance.
The structure of this document is as follows. Section 2, “Data model” describes the data model, which is the abstraction of an XML document used throughout the rest of the document. Section 3, “Full syntax” describes the syntax of a RELAX NG schema; any correct RELAX NG schema must conform to this syntax. Section 4, “Simplification” describes a sequence of transformations that are applied to simplify a RELAX NG schema; applying the transformations also involves checking certain restrictions that must be satisfied by a correct RELAX NG schema. Section 5, “Simple syntax” describes the syntax that results from applying the transformations; this simple syntax is a subset of the full syntax. Section 6, “Semantics” describes the semantics of a correct RELAX NG schema that uses the simple syntax; the semantics specify when an element is valid with respect to a RELAX NG schema. Section 7, “Restrictions” describes restrictions in terms of the simple syntax; a correct RELAX NG schema must be such that, after transformation into the simple form, it satisfies these restrictions. Finally, Section 8, “Conformance” describes conformance requirements for RELAX NG validators.
A tutorial is available separately (see [Tutorial]).
RELAX NG deals with XML documents representing both schemas and instances through an abstract data model. XML documents representing schemas and instances must be well-formed in conformance with [XML 1.0] and must conform to the constraints of [XML Namespaces].
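An XML document is represented by an element. An element consists of a name, a context, a set of attributes, and an ordered sequence of zero or more children, each of which is either an element or a string.
A name consists of a namespace URI and a local name.
A context consists of a base URI and a namespace map.
An attribute consists of a name and a value, which is a string.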
A string consists of a sequence of zero or more characters, where a character is as defined in [XML 1.0].
The element for an XML document is constructed from an instance of the [XML Infoset] as follows. We use the notation [x] to refer to the value of the x property of an information item. An element is constructed from a document information item by constructing an element from the [document element]. An element is constructed from an element information item by constructing the name from the [namespace name] and [local name], the context from the [base URI] and [in-scope namespaces], the attributes from the [attributes], and the children from the [children]. The attributes of an element are constructed from the unordered set of attribute information items by constructing an attribute for each attribute information item. The children of an element are constructed from the list of child information items first by removing information items other than element information items and character information items, and then by constructing an element for each element information item in the list and a string for each maximal sequence of character information items. An attribute is constructed from an attribute information item by constructing the name from the [namespace name] and [local name], and the value from the [normalized value]. When constructing the name of an element or attribute from the [namespace name] and [local name], if the [namespace name] property is not present, then the name is constructed from an empty string and the [local name]. A string is constructed from a sequence of character information items by constructing a character from the [character code] of each character information item.
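The construction above can be approximated in a few lines of Python. This is a minimal sketch, not a conforming implementation: it starts from the tree produced by the standard xml.etree.ElementTree parser rather than from a full infoset, it omits the context (base URI and namespace map), and the Element class and the split_name and build helpers are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Element:
    name: tuple        # (namespace URI, local name)
    attributes: dict   # {(namespace URI, local name): value}
    children: list = field(default_factory=list)  # Element or str items


def split_name(tag):
    """Split ElementTree's '{uri}local' form; a missing namespace becomes ''."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return ("", tag)


def build(node):
    """Construct a data-model element from an ElementTree node."""
    elem = Element(split_name(node.tag),
                   {split_name(k): v for k, v in node.attrib.items()})
    if node.text:                       # character data before the first child
        elem.children.append(node.text)
    for child in node:
        elem.children.append(build(child))
        if child.tail:                  # character data following this child
            elem.children.append(child.tail)
    return elem


doc = ET.fromstring(
    '<foo><pre1:bar1 xmlns:pre1="http://www.example.com/n1"/>'
    '<pre2:bar2 xmlns:pre2="http://www.example.com/n2"/></foo>')
root = build(doc)
```

Names are held as (namespace URI, local name) pairs, so the prefixes used in the source text play no part in the result.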
It is possible for there to be multiple distinct infosets for a single XML document. This is because XML parsers are not required to process all DTD declarations or expand all external parsed general entities. Amongst these multiple infosets, there is exactly one infoset for which [all declarations processed] is true and which does not contain any unexpanded entity reference information items. This is the infoset that is the basis for defining the RELAX NG data model.
Suppose the document http://www.example.com/doc.xml is as follows:
<?xml version="1.0"?>
<foo><pre1:bar1 xmlns:pre1="http://www.example.com/n1"/><pre2:bar2 xmlns:pre2="http://www.example.com/n2"/></foo>
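The element representing this document has a name whose namespace URI is the empty string and whose local name is foo, a context whose base URI is http://www.example.com/doc.xml, an empty set of attributes, and two element children, named bar1 and bar2, with namespace URIs http://www.example.com/n1 and http://www.example.com/n2 respectively.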
The following grammar summarizes the syntax of RELAX NG. Although we use a notation based on the XML representation of a RELAX NG schema as a sequence of characters, the grammar must be understood as operating at the data model level. For example, although the syntax uses <text/>, an instance or schema can use <text></text> instead, because they both represent the same element at the data model level. All elements shown in the grammar are qualified with the namespace URI:
http://relaxng.org/ns/structure/1.0
The symbols QName and NCName are defined in [XML Namespaces]. The anyURI symbol has the same meaning as the anyURI datatype of [W3C XML Schema Datatypes]: it indicates a string that, after escaping of disallowed values as described in Section 5.4 of [XLink], is a URI reference as defined in [RFC 2396] (as modified by [RFC 2732]). The symbol string matches any string.
In addition to the attributes shown explicitly, any element can have an ns attribute and any element can have a datatypeLibrary attribute. The ns attribute can have any value. The value of the datatypeLibrary attribute must match the anyURI symbol as described in the previous paragraph; in addition, it must not use the relative form of URI reference and must not have a fragment identifier; as an exception to this, the value may be the empty string.
Any element can also have foreign attributes in addition to the attributes shown in the grammar. A foreign attribute is an attribute with a name whose namespace URI is neither the empty string nor the RELAX NG namespace URI. Any element that cannot have string children (that is, any element other than value, param and name) may have foreign child elements in addition to the child elements shown in the grammar. A foreign element is an element with a name whose namespace URI is not the RELAX NG namespace URI. There are no constraints on the relative position of foreign child elements with respect to other child elements.
Any element can also have as children strings that consist entirely of whitespace characters, where a whitespace character is one of #x20, #x9, #xD or #xA. There are no constraints on the relative position of whitespace string children with respect to child elements.
Leading and trailing whitespace is allowed for the value of each name, type and combine attribute and for the content of each name element.
pattern ::= <element name="QName"> pattern+ </element>
          | <element> nameClass pattern+ </element>
          | <attribute name="QName"> [pattern] </attribute>
          | <attribute> nameClass [pattern] </attribute>
          | <group> pattern+ </group>
          | <interleave> pattern+ </interleave>
          | <choice> pattern+ </choice>
          | <optional> pattern+ </optional>
          | <zeroOrMore> pattern+ </zeroOrMore>
          | <oneOrMore> pattern+ </oneOrMore>
          | <list> pattern+ </list>
          | <mixed> pattern+ </mixed>
          | <ref name="NCName"/>
          | <parentRef name="NCName"/>
          | <empty/>
          | <text/>
          | <value [type="NCName"]> string </value>
          | <data type="NCName"> param* [exceptPattern] </data>
          | <notAllowed/>
          | <externalRef href="anyURI"/>
          | <grammar> grammarContent* </grammar>
param ::= <param name="NCName"> string </param>
exceptPattern ::= <except> pattern+ </except>
grammarContent ::= start
                 | define
                 | <div> grammarContent* </div>
                 | <include href="anyURI"> includeContent* </include>
includeContent ::= start
                 | define
                 | <div> includeContent* </div>
start ::= <start [combine="method"]> pattern </start>
define ::= <define name="NCName" [combine="method"]> pattern+ </define>
method ::= choice
         | interleave
nameClass ::= <name> QName </name>
            | <anyName> [exceptNameClass] </anyName>
            | <nsName> [exceptNameClass] </nsName>
            | <choice> nameClass+ </choice>
exceptNameClass ::= <except> nameClass+ </except>
Here is an example of a schema in the full syntax for the document in Section 2.1, “Example”.
<?xml version="1.0"?>
<element name="foo"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/annotation/1.0"
         xmlns:ex1="http://www.example.com/n1"
         xmlns:ex2="http://www.example.com/n2">
  <a:documentation>A foo element.</a:documentation>
  <element name="ex1:bar1">
    <empty/>
  </element>
  <element name="ex2:bar2">
    <empty/>
  </element>
</element>
The full syntax given in the previous section is transformed into a simpler syntax by applying the following transformation rules in order. The effect must be as if each rule was applied to all elements in the schema before the next rule is applied. A transformation rule may also specify constraints that must be satisfied by a correct schema. The transformation rules are applied at the data model level. Before the transformations are applied, the schema is parsed into an instance of the data model.
Foreign attributes and elements are removed.
It is safe to remove xml:base attributes at this stage because xml:base attributes are used in determining the [base URI] of an element information item, which is in turn used to construct the base URI of the context of an element. Thus, after a document has been parsed into an instance of the data model, xml:base attributes can be discarded.
For each element other than value and param, each child that is a string containing only whitespace characters is removed.
Leading and trailing whitespace characters are removed from the value of each name, type and combine attribute and from the content of each name element.
The value of each datatypeLibrary attribute is transformed by escaping disallowed characters as specified in Section 5.4 of [XLink].
For any data or value element that does not have a datatypeLibrary attribute, a datatypeLibrary attribute is added. The value of the added datatypeLibrary attribute is the value of the datatypeLibrary attribute of the nearest ancestor element that has a datatypeLibrary attribute, or the empty string if there is no such ancestor. Then, any datatypeLibrary attribute that is on an element other than data or value is removed.
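The inheritance rule above amounts to a single top-down pass over the schema. The following Python sketch illustrates it under simplifying assumptions: the schema is held as an xml.etree.ElementTree tree, element names are written without the RELAX NG namespace for brevity, and the function name propagate_datatype_library and the library URI http://example.org/dt are hypothetical.

```python
import xml.etree.ElementTree as ET


def propagate_datatype_library(elem, inherited=""):
    """Apply the datatypeLibrary rule to elem and its descendants in place."""
    # The nearest value wins: the element's own attribute, else the ancestor's.
    current = elem.get("datatypeLibrary", inherited)
    if elem.tag in ("data", "value"):
        elem.set("datatypeLibrary", current)
    elif "datatypeLibrary" in elem.attrib:
        # The attribute is removed from every other kind of element.
        del elem.attrib["datatypeLibrary"]
    for child in elem:
        propagate_datatype_library(child, current)


schema = ET.fromstring(
    '<element name="n" datatypeLibrary="http://example.org/dt">'
    '<choice><data type="integer"/>'
    '<value datatypeLibrary="">x</value></choice></element>')
propagate_datatype_library(schema)
```

After the call, the data element carries the inherited library URI, the value element keeps its own empty string, and the outer element no longer has the attribute.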
For any value element that does not have a type attribute, a type attribute is added with value token and the value of the datatypeLibrary attribute is changed to the empty string.
The value of the href attribute on an externalRef or include element is first transformed by escaping disallowed characters as specified in Section 5.4 of [XLink]. The URI reference is then resolved into an absolute form as described in section 5.2 of [RFC 2396] using the base URI from the context of the element that bears the href attribute.
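As an illustration of the two steps just described, the sketch below escapes disallowed characters and then resolves the reference against a base URI. It rests on assumptions that should be checked against the cited documents: the character set treated as disallowed is our reading of Section 5.4 of [XLink], and urllib.parse.urljoin follows a later URI specification than [RFC 2396], so edge cases may differ. The names escape_disallowed and resolve_href and the example URIs are hypothetical.

```python
import re
from urllib.parse import urljoin

# Assumed set of disallowed characters: controls, space, a few ASCII
# delimiters, and every non-ASCII character. '#', '%', '[' and ']' are kept.
_DISALLOWED = re.compile(r'[\x00-\x20\x7f<>"{}|\\^`]|[^\x00-\x7f]')


def escape_disallowed(uri):
    """Replace each disallowed character by %HH escapes of its UTF-8 bytes."""
    def repl(match):
        return "".join("%%%02X" % b for b in match.group().encode("utf-8"))
    return _DISALLOWED.sub(repl, uri)


def resolve_href(href, base_uri):
    """Escape href, then resolve it against the base URI of its element."""
    return urljoin(base_uri, escape_disallowed(href))


resolved = resolve_href("lib/common types.rng",
                        "http://www.example.com/schemas/main.rng")
```

Here resolved is an absolute URI in which the space has been escaped as %20.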
The value of the href attribute will be used to construct an element (as specified in Section 2, “Data model”). This must be done as follows. The URI reference consists of the URI itself and an optional fragment identifier. The resource identified by the URI is retrieved. The result is a MIME entity: a sequence of bytes labeled with a MIME media type. The media type determines how an element is constructed from the MIME entity and optional fragment identifier. When the media type is application/xml or text/xml, the MIME entity must be parsed as an XML document in accordance with the applicable RFC (at the time of writing [RFC 3023]) and an element constructed from the result of the parse as specified in Section 2, “Data model”. In particular, the charset parameter must be handled as specified by the RFC. This specification does not define the handling of media types other than application/xml and text/xml. The href attribute must not include a fragment identifier unless the registration of the media type of the resource identified by the attribute defines the interpretation of fragment identifiers for that media type.
[RFC 3023] does not define the interpretation of fragment identifiers for application/xml or text/xml.
An externalRef element is transformed as follows. An element is constructed using the URI reference that is the value of the href attribute as specified in Section 4.5, “href attribute”. This element must match the syntax for pattern. The element is transformed by recursively applying the rules from this subsection and from previous subsections of this section. This must not result in a loop. In other words, the transformation of the referenced element must not require the dereferencing of an externalRef element with an href attribute with the same value.
Any ns attribute on the externalRef element is transferred to the referenced element if the referenced element does not already have an ns attribute. The externalRef element is then replaced by the referenced element.
An include element is transformed as follows. An element is constructed using the URI reference that is the value of the href attribute as specified in Section 4.5, “href attribute”. This element must be a grammar element, matching the syntax for grammar.
This grammar element is transformed by recursively applying the rules from this subsection and from previous subsections of this section. This must not result in a loop. In other words, the transformation of the grammar element must not require the dereferencing of an include element with an href attribute with the same value.
Define the components of an element to be the children of the element together with the components of any div child elements. If the include element has a start component, then the grammar element must have a start component. If the include element has a start component, then all start components are removed from the grammar element. If the include element has a define component, then the grammar element must have a define component with the same name. For every define component of the include element, all define components with the same name are removed from the grammar element.
The include element is transformed into a div element. The attributes of the div element are the attributes of the include element other than the href attribute. The children of the div element are the grammar element (after the removal of the start and define components described by the preceding paragraph) followed by the children of the include element. The grammar element is then renamed to div.
The name attribute on an element or attribute element is transformed into a name child element.
If an attribute element has a name attribute but no ns attribute, then an ns="" attribute is added to the name child element.
For any name, nsName or value element that does not have an ns attribute, an ns attribute is added. The value of the added ns attribute is the value of the ns attribute of the nearest ancestor element that has an ns attribute, or the empty string if there is no such ancestor. Then, any ns attribute that is on an element other than name, nsName or value is removed.
The value of the ns attribute is not transformed either by escaping disallowed characters, or in any other way, because the value of the ns attribute is compared against namespace URIs in the instance, which are not subject to any transformation.
Since include and externalRef elements are resolved after datatypeLibrary attributes are added but before ns attributes are added, ns attributes are inherited into external schemas but datatypeLibrary attributes are not.
For any name element containing a prefix, the prefix is removed and an ns attribute is added replacing any existing ns attribute. The value of the added ns attribute is the value to which the namespace map of the context of the name element maps the prefix. The context must have a mapping for the prefix.
A define, oneOrMore, zeroOrMore, optional, list or mixed element is transformed so that it has exactly one child element. If it has more than one child element, then its child elements are wrapped in a group element. Similarly, an element element is transformed so that it has exactly two child elements, the first being a name class and the second being a pattern. If it has more than two child elements, then the child elements other than the first are wrapped in a group element.
An except element is transformed so that it has exactly one child element. If it has more than one child element, then its child elements are wrapped in a choice element.
If an attribute element has only one child element (a name class), then a text element is added.
A choice, group or interleave element is transformed so that it has exactly two child elements. If it has one child element, then it is replaced by its child element. If it has more than two child elements, then the first two child elements are combined into a new element with the same name as the parent element and with the first two child elements as its children. For example,
<choice> p1 p2 p3 </choice>
is transformed to
<choice> <choice> p1 p2 </choice> p3 </choice>
This reduces the number of child elements by one. The transformation is applied repeatedly until there are exactly two child elements.
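Applying the rule repeatedly is the same as folding the children from the left. The Python sketch below shows this on a hypothetical tuple encoding in which an element is written (name, child1, child2) and the placeholder patterns are plain strings; neither the encoding nor the function name binarize comes from this specification.

```python
def binarize(name, children):
    """Reduce an n-ary choice, group or interleave to nested binary elements."""
    if len(children) == 1:
        # A single child replaces its parent.
        return children[0]
    result = (name, children[0], children[1])
    for child in children[2:]:
        # Each step wraps the pair built so far together with the next child.
        result = (name, result, child)
    return result


nested = binarize("choice", ["p1", "p2", "p3"])
```

For the three-child example above, nested is the pair whose first member is the choice of p1 and p2 and whose second member is p3.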
A mixed element is transformed into an interleaving with a text element:
<mixed> p </mixed>
is transformed into
<interleave> p <text/> </interleave>
An optional element is transformed into a choice element with one child being the child of the optional element and the other child being empty:
<optional> p </optional>
is transformed into
<choice> p <empty/> </choice>
A zeroOrMore element is transformed into a choice element with one child being a <oneOrMore> p </oneOrMore> element and the other child being an empty element, where p is the child of the zeroOrMore element:
<zeroOrMore> p </zeroOrMore>
is transformed into
<choice> <oneOrMore> p </oneOrMore> <empty/> </choice>
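The three rewrites for mixed, optional and zeroOrMore can be sketched as one small function. The tuple encoding (element name followed by its children) and the name desugar are hypothetical and serve only to illustrate the rules; each input is assumed to have exactly one child already, as the earlier rules guarantee.

```python
def desugar(pattern):
    """Rewrite a mixed, optional or zeroOrMore element with a single child."""
    kind = pattern[0]
    if kind == "mixed":
        return ("interleave", pattern[1], ("text",))
    if kind == "optional":
        return ("choice", pattern[1], ("empty",))
    if kind == "zeroOrMore":
        return ("choice", ("oneOrMore", pattern[1]), ("empty",))
    return pattern  # any other element is left unchanged


rewritten = desugar(("zeroOrMore", "p"))
```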
In this rule, no transformation is performed, but various constraints are checked.
The constraints in this section, unlike the constraints specified in Section 7, “Restrictions”, can be checked without resolving any ref elements, and are accordingly applied even to patterns that will disappear during later stages of simplification because they are not reachable (see Section 4.19, “define and ref elements”) or because of notAllowed (see Section 4.20, “notAllowed element”).
An except element that is a child of an anyName element must not have any anyName descendant elements. An except element that is a child of an nsName element must not have any nsName or anyName descendant elements.
A name element that occurs as the first child of an attribute element or as the descendant of the first child of an attribute element and that has an ns attribute with value equal to the empty string must not have content equal to xmlns.
A name or nsName element that occurs as the first child of an attribute element or as the descendant of the first child of an attribute element must not have an ns attribute with value http://www.w3.org/2000/xmlns.
The [XML Infoset] defines the namespace URI of namespace declaration attributes to be http://www.w3.org/2000/xmlns.
A data or value element must be correct in its use of datatypes. Specifically, the type attribute must identify a datatype within the datatype library identified by the value of the datatypeLibrary attribute. For a data element, the parameter list must be one that is allowed by the datatype (see Section 6.2.8, “data and value pattern”).
For each grammar element, all define elements with the same name are combined together. For any name, there must not be more than one define element with that name that does not have a combine attribute. For any name, if there is a define element with that name that has a combine attribute with the value choice, then there must not also be a define element with that name that has a combine attribute with the value interleave. Thus, for any name, if there is more than one define element with that name, then there is a unique value for the combine attribute for that name. After determining this unique value, the combine attributes are removed. A pair of definitions
<define name="n"> p1 </define>
<define name="n"> p2 </define>
is combined into
<define name="n"> <c> p1 p2 </c> </define>
where c is the value of the combine attribute. Pairs of definitions are combined until there is exactly one define element for each name.
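The combination rule and its two constraints can be sketched as follows. The sketch assumes that the define elements of one grammar have been collected into a list of (name, combine, pattern) triples in document order, with combine set to None when the attribute is absent; this representation and the name combine_defines are hypothetical.

```python
def combine_defines(defines):
    """Merge same-named definitions; defines is a list of (name, combine, pattern)."""
    grouped = {}
    for name, combine, pattern in defines:
        grouped.setdefault(name, []).append((combine, pattern))

    result = {}
    for name, entries in grouped.items():
        methods = {c for c, _ in entries if c is not None}
        without_combine = sum(1 for c, _ in entries if c is None)
        if without_combine > 1:
            raise ValueError("more than one define for %r lacks combine" % name)
        if len(methods) > 1:
            raise ValueError("conflicting combine values for %r" % name)
        method = next(iter(methods), None)  # the unique value, if any
        combined = entries[0][1]
        for _, pattern in entries[1:]:
            # Pairs are combined until a single definition remains.
            combined = (method, combined, pattern)
        result[name] = combined
    return result


merged = combine_defines([("n", "choice", "p1"), ("n", None, "p2")])
```

The result maps each name to a single pattern; here the two definitions of n become a choice between p1 and p2.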
Similarly, for each grammar element all start elements are combined together. There must not be more than one start element that does not have a combine attribute. If there is a start element that has a combine attribute with the value choice, there must not also be a start element that has a combine attribute with the value interleave.
In this rule, the schema is transformed so that its top-level element is grammar and so that it has no other grammar elements.
Define the in-scope grammar for an element to be the nearest ancestor grammar element. A ref element refers to a define element if the value of their name attributes is the same and their in-scope grammars are the same. A parentRef element refers to a define element if the value of their name attributes is the same and the in-scope grammar of the in-scope grammar of the parentRef element is the same as the in-scope grammar of the define element. Every ref or parentRef element must refer to a define element. A grammar must have a start child element.
First, transform the top-level pattern p into <grammar><start>p</start></grammar>. Next, rename define elements so that no two define elements anywhere in the schema have the same name. To rename a define element, change the value of its name attribute and change the value of the name attribute of all ref and parentRef elements that refer to that define element. Next, move all define elements to be children of the top-level grammar element, replace each nested grammar element by the child of its start element and rename each parentRef element to ref.
In this rule, the grammar is transformed so that every element element is the child of a define element, and the child of every define element is an element element.
First, remove any define element that is not reachable. A define element is reachable if there is a reachable ref element referring to it. A ref element is reachable if it is the descendant of the start element or of a reachable define element. Now, for each element element that is not the child of a define element, add a define element to the grammar element, and replace the element element by a ref element referring to the added define element. The value of the name attribute of the added define element must be different from the value of the name attribute of all other define elements. The child of the added define element is the element element.
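The reachability test in the first step above is a graph traversal starting from the start element. The Python sketch below illustrates it on a hypothetical encoding in which patterns are tuples whose first item is the element name, a reference is written ("ref", name), and the definitions are held in a dictionary from name to pattern; the function names are likewise illustrative.

```python
def refs_in(pattern):
    """Collect the names of all ref elements occurring inside a pattern."""
    if not isinstance(pattern, tuple):
        return set()
    if pattern[0] == "ref":
        return {pattern[1]}
    found = set()
    for part in pattern[1:]:
        found |= refs_in(part)
    return found


def reachable_defines(start, defines):
    """Return the names of the definitions reachable from the start pattern."""
    seen, todo = set(), list(refs_in(start))
    while todo:
        name = todo.pop()
        if name not in seen:
            seen.add(name)
            todo.extend(refs_in(defines[name]))
    return seen


defines = {
    "a": ("element", ("name", "", "a"), ("ref", "b")),
    "b": ("element", ("name", "", "b"), ("empty",)),
    "c": ("element", ("name", "", "c"), ("ref", "a")),
}
live = reachable_defines(("ref", "a"), defines)
```

Definition c refers to a but is itself never referred to from the start pattern, so it would be removed.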
Define a ref element to be expandable if it refers to a define element whose child is not an element element. For each ref element that is expandable and is a descendant of a start element or an element element, expand it by replacing the ref element by the child of the define element to which it refers and then recursively expanding any expandable ref elements in this replacement. This must not result in a loop. In other words, expanding the replacement of a ref element having a name with value n must not require the expansion of a ref element also having a name with value n. Finally, remove any define element whose child is not an element element.
In this rule, the grammar is transformed so that a notAllowed element occurs only as the child of a start or element element. An attribute, list, group, interleave, or oneOrMore element that has a notAllowed child element is transformed into a notAllowed element. A choice element that has two notAllowed child elements is transformed into a notAllowed element. A choice element that has one notAllowed child element is transformed into its other child element. An except element that has a notAllowed child element is removed. The preceding transformations are applied repeatedly until none of them is applicable any more. Any define element that is no longer reachable is removed.
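Because each rewrite only looks at the immediate children, the repeated application described above can be obtained by rewriting the pattern tree bottom-up. The following Python sketch does this on a hypothetical tuple encoding (element name followed by its children); it leaves out the rule for except elements and the final removal of unreachable define elements, and the names used are illustrative only.

```python
NOT_ALLOWED = ("notAllowed",)
PROPAGATING = ("attribute", "list", "group", "interleave", "oneOrMore")


def simplify_not_allowed(p):
    """Rewrite a pattern bottom-up according to the notAllowed rules."""
    if not isinstance(p, tuple) or len(p) == 1:
        return p  # strings and childless elements are left as they are
    kind, *children = p
    children = [simplify_not_allowed(c) for c in children]
    if kind in PROPAGATING and NOT_ALLOWED in children:
        return NOT_ALLOWED
    if kind == "choice":
        live = [c for c in children if c != NOT_ALLOWED]
        if not live:
            return NOT_ALLOWED
        if len(live) == 1:
            return live[0]
    # An element keeps a notAllowed child; it does not propagate upwards.
    return (kind, *children)


simplified = simplify_not_allowed(
    ("choice", ("group", ("ref", "a"), NOT_ALLOWED), ("text",)))
```

In the example the group collapses to notAllowed first, after which the enclosing choice reduces to its other child, the text pattern.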
In this rule, the grammar is transformed so that an empty element does not occur as a child of a group, interleave, or oneOrMore element or as the second child of a choice element. A group, interleave or choice element that has two empty child elements is transformed into an empty element. A group or interleave element that has one empty child element is transformed into its other child element. A choice element whose second child element is an empty element is transformed by interchanging its two child elements. A oneOrMore element that has an empty child element is transformed into an empty element. The preceding transformations are applied repeatedly until none of them is applicable any more.
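The rules for empty can be sketched in the same bottom-up style. As before, the tuple encoding (element name followed by its children) and the function name are hypothetical, and group, interleave and choice elements are assumed to have exactly two children already.

```python
EMPTY = ("empty",)


def simplify_empty(p):
    """Rewrite a pattern bottom-up according to the empty rules."""
    if not isinstance(p, tuple) or len(p) == 1:
        return p  # strings and childless elements are left as they are
    kind, *children = p
    children = [simplify_empty(c) for c in children]
    if kind in ("group", "interleave"):
        live = [c for c in children if c != EMPTY]
        if not live:
            return EMPTY
        if len(live) == 1:
            return live[0]
    elif kind == "choice":
        if children == [EMPTY, EMPTY]:
            return EMPTY
        if children[1] == EMPTY:
            children.reverse()  # empty may only be the first child of a choice
    elif kind == "oneOrMore" and children == [EMPTY]:
        return EMPTY
    return (kind, *children)


swapped = simplify_empty(("choice", ("ref", "a"), ("group", EMPTY, EMPTY)))
```

In the example the inner group becomes empty, which then sits in second position of the choice and is moved to the front.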
After applying all the rules in Section 4, “Simplification”, the schema will match the following grammar:
grammar ::= <grammar> <start> top </start> define* </grammar>
define ::= <define name="NCName"> <element> nameClass top </element> </define>
top ::= <notAllowed/>
      | pattern
pattern ::= <empty/>
          | nonEmptyPattern
nonEmptyPattern ::= <text/>
                  | <data type="NCName" datatypeLibrary="anyURI"> param* [exceptPattern] </data>
                  | <value datatypeLibrary="anyURI" type="NCName" ns="string"> string </value>
                  | <list> pattern </list>
                  | <attribute> nameClass pattern </attribute>
                  | <ref name="NCName"/>
                  | <oneOrMore> nonEmptyPattern </oneOrMore>
                  | <choice> pattern nonEmptyPattern </choice>
                  | <group> nonEmptyPattern nonEmptyPattern </group>
                  | <interleave> nonEmptyPattern nonEmptyPattern </interleave>
param ::= <param name="NCName"> string </param>
exceptPattern ::= <except> pattern </except>
nameClass ::= <anyName> [exceptNameClass] </anyName>
            | <nsName ns="string"> [exceptNameClass] </nsName>
            | <name ns="string"> NCName </name>
            | <choice> nameClass nameClass </choice>
exceptNameClass ::= <except> nameClass </except>
With this grammar, no elements or attributes are allowed other than those explicitly shown.
The following is an example of how the schema in Section 3.1, “Example” can be transformed into the simple syntax:
<?xml version="1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="foo.element"/>
  </start>
  <define name="foo.element">
    <element>
      <name ns="">foo</name>
      <group>
        <ref name="bar1.element"/>
        <ref name="bar2.element"/>
      </group>
    </element>
  </define>
  <define name="bar1.element">
    <element>
      <name ns="http://www.example.com/n1">bar1</name>
      <empty/>
    </element>
  </define>
  <define name="bar2.element">
    <element>
      <name ns="http://www.example.com/n2">bar2</name>
      <empty/>
    </element>
  </define>
</grammar>
Strictly speaking, the result of simplification is an instance of the data model rather than an XML document. For convenience, we use an XML document to represent an instance of the data model.
In this section, we define the semantics of a correct RELAX NG schema that has been transformed into the simple syntax. The semantics of a RELAX NG schema consist of a specification of what XML documents are valid with respect to that schema. The semantics are described formally. The formalism uses axioms and inference rules. Axioms are propositions that are provable unconditionally. An inference rule consists of one or more antecedents and exactly one consequent. An antecedent is either positive or negative. If all the positive antecedents of an inference rule are provable and none of the negative antecedents are provable, then the consequent of the inference rule is provable. An XML document is valid with respect to a RELAX NG schema if and only if the proposition that it is valid is provable in the formalism specified in this section.
This kind of formalism is similar to a proof system. However, a traditional proof system only has positive antecedents.
The notation for inference rules separates the antecedents from the consequent by a horizontal line: the antecedents are above the line; the consequent is below the line. If an antecedent is of the form not(p), then it is a negative antecedent; otherwise, it is a positive antecedent. Both axioms and inference rules may use variables. A variable has a name and optionally a subscript. The name of a variable is italicized. Each variable has a range that is determined by its name. Axioms and inference rules are implicitly universally quantified over the variables they contain. We explain this further below.
The possibility that an inference rule or axiom may contain more than one occurrence of a particular variable requires that an identity relation be defined on each kind of object over which a variable can range. The identity relation for all kinds of object is value-based. Two objects of a particular kind are identical if the constituents of the objects are identical. For example, two attributes are considered the same if they have the same name and the same value. Two characters are identical if their Unicode character codes are the same.
The main semantic concept for name classes is that of a name belonging to a name class. A name class is an element that matches the production nameClass. A name is as defined in Section 2, “Data model”: it consists of a namespace URI and a local name.
We use the following notation:
We are now ready for our first axiom, which is called "anyName 1":
(anyName 1)
\[ \frac{}{n \in \texttt{<anyName/>}} \]
This says that for any name n, n belongs to the name class <anyName/>; in other words, <anyName/> matches any name. Note the effect of the implicit universal quantification over the variables in the axiom: this is what makes the axiom apply for any name n.
Our first inference rule is almost as simple:
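(anyName 2)
\[ \frac{\mathrm{not}(n \in nc)}{n \in \texttt{<anyName> <except> } nc \texttt{ </except> </anyName>}} \]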
This says that for any name n and for any name class nc, if n does not belong to nc, then n belongs to <anyName> <except> nc </except> </anyName>. In other words, <anyName> <except> nc </except> </anyName> matches any name that does not match nc.
We now need the following additional notation:
The remaining axioms and inference rules for name classes are as follows:
(nsName 1) |
|
(nsName 2) |
|
(name) |
|
(name choice 1) |
|
(name choice 2) |
|
The axioms and inference rules for patterns use the following notation:
The semantics of the choice pattern are as follows:
(choice 1) |
|
(choice 2) |
|
We use the following additional notation:
The semantics of the group pattern are as follows:
(group) |
|
The restriction in Section 7.3, “Restrictions on attributes” ensures that the set of attributes constructed in the consequent will not have multiple attributes with the same name.
We use the following additional notation:
The semantics of the empty pattern are as follows:
(empty) |
|
We use the following additional notation:
The semantics of the text pattern are as follows:
(text 1) |
|
(text 2) |
|
The effect of the above rule is that a text element matches zero or more strings.
We use the following additional notation:
The semantics of the oneOrMore pattern are as follows:
(oneOrMore 1) |
|
(oneOrMore 2) |
|
We use the following additional notation:
The semantics of interleaving are defined by the following rules.
(interleaves 1) |
|
(interleaves 2) |
|
(interleaves 3) |
|
For example, the interleavings of <a/><a/> and <b/> are <a/><a/><b/>, <a/><b/><a/>, and <b/><a/><a/>.
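The example can be reproduced with a short recursive enumeration. The following Python sketch is an informal illustration, not a transcription of the rules: the function name interleavings is an assumption, and items are represented as plain strings. It produces every merge of two sequences that preserves the relative order within each sequence.

```python
from typing import Iterator, Sequence, Tuple


def interleavings(xs: Sequence[str], ys: Sequence[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every merge of xs and ys that keeps each sequence's own order."""
    if not xs:
        # Only ys remains, so there is exactly one way to finish.
        yield tuple(ys)
        return
    if not ys:
        # Only xs remains, so there is exactly one way to finish.
        yield tuple(xs)
        return
    # Either the next item comes from xs ...
    for rest in interleavings(xs[1:], ys):
        yield (xs[0],) + rest
    # ... or it comes from ys.
    for rest in interleavings(xs, ys[1:]):
        yield (ys[0],) + rest


# Collect the distinct results for the example in the text.
result = sorted({"".join(seq) for seq in interleavings(["<a/>", "<a/>"], ["<b/>"])})
```

For the two sequences in the example, result contains the three interleavings listed above.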
The semantics of the interleave pattern are as follows:
(interleave) |
|
The restriction in Section 7.3, “Restrictions on attributes” ensures that the set of attributes constructed in the consequent will not have multiple attributes with the same name.
The value of an attribute is always a single string, which may be empty. Thus, the empty sequence is not a possible attribute value. On the other hand, the children of an element can be an empty sequence and cannot consist of an empty string. In order to ensure that validation handles attributes and elements consistently, we introduce a variant of matching called weak matching. Weak matching is used when matching the pattern for the value of an attribute or for the attributes and children of an element. We use the following notation to define weak matching.
The semantics of weak matching are as follows:
(weak match 1) |
|
(weak match 2) |
|
(weak match 3) |
|
We use the following additional notation:
<element> nc p </element>
<define name="ln"> <element> nc p </element> </define>
The semantics of the attribute pattern are as follows:
(attribute) |
|
The semantics of the element pattern are as follows:
(element) |
|
RELAX NG relies on datatype libraries to perform datatyping. A datatype library is identified by a URI. A datatype within a datatype library is identified by an NCName. A datatype library provides two services.
Both services may make use of the context of a string. For example, a datatype representing a QName would use the namespace map.
We use the following additional notation:
The datatypeEqual function must be reflexive, transitive and symmetric, that is, the following inference rules must hold:
(datatypeEqual reflexive) |
|
(datatypeEqual transitive) |
|
(datatypeEqual symmetric) |
|
The semantics of the data and value patterns are as follows:
(value) |
|
(data 1) |
|
(data 2) |
|
The empty URI identifies a special built-in datatype library. This provides two datatypes, string and token. No parameters are allowed for either of these datatypes.
The semantics of the two built-in datatypes are as follows:
(string allows) |
|
(string equal) |
|
(token allows) |
|
(token equal) |
|
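The following Python sketch illustrates how the two built-in datatypes are commonly understood to behave; it is a hedged illustration under that understanding, not a transcription of the rules above. It assumes that both datatypes accept any string, that string compares values exactly, and that token compares values after whitespace normalization. All function names are hypothetical.

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
_XML_WS = " \t\r\n"


def normalize_whitespace(s: str) -> str:
    """Drop leading/trailing whitespace and collapse inner runs to one space."""
    tokens = [t for t in re.split(f"[{_XML_WS}]+", s) if t]
    return " ".join(tokens)


def builtin_allows(datatype: str, s: str) -> bool:
    # Assumption: with no parameters, both built-in datatypes allow any string.
    if datatype in ("string", "token"):
        return True
    raise ValueError(f"not a built-in datatype: {datatype}")


def string_equal(s1: str, s2: str) -> bool:
    # Assumption: string values are equal only if they are the same string.
    return s1 == s2


def token_equal(s1: str, s2: str) -> bool:
    # Assumption: token values are compared after whitespace normalization.
    return normalize_whitespace(s1) == normalize_whitespace(s2)
```

Under these assumptions, " a \n b " and "a b" are equal as token values but not as string values.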
We use the following additional notation:
The semantics of the list pattern are as follows:
(list) |
|
It is crucial in the above inference rule that the sequence that is matched against a pattern can contain consecutive strings.
Now we can define when an element is valid with respect to a schema. We use the following additional notation:
An element is valid if together with an empty set of attributes it matches the start pattern of the grammar.
(valid) |
|
Let e0 be
"", "foo" ), cx0, { }, m )
where m is
and e1 is
"http://www.example.com/n1", "bar1" ), cx1, { }, ( ) )
and e2 is
"http://www.example.com/n2", "bar2" ), cx2, { }, ( ) )
Assuming appropriate definitions of cx0, cx1 and cx2, this represents the document in Section 2.1, “Example”.
We now show how e0 can be shown to be valid with respect to the schema in Section 5.1, “Example”. The schema is equivalent to the following propositions:
<ref name="foo"/>
"foo.element" ) = <element> <name ns=""> "foo" </name> <group> <ref name="bar1"/> <ref name="bar2"/> </group> </element>
"bar1.element" ) = <element> <name ns="http://www.example.com/n1"> "bar1" </name> <empty/> </element>
"bar2.element" ) = <element> <name ns="http://www.example.com/n2"> "bar2" </name> <empty/> </element>
Let name class nc1 be
<name ns="http://www.example.com/n1"> "bar1" </name>
and let nc2 be
<name ns="http://www.example.com/n2"> "bar2" </name>
Then, by the inference rule (name) in Section 6.1, “Name classes”, we have
"http://www.example.com/n1", "bar1" ) in nc1
and
"http://www.example.com/n2", "bar2" ) in nc2
By the inference rule (empty) in Section 6.2.3, “empty pattern”, we have
<empty/>
and
<empty/>
Thus by the inference rule (element) in Section 6.2.7, “element and attribute pattern”, we have
<ref name="bar1"/>
Note that we have chosen cx0, since any context is allowed.
Likewise, we have
<ref name="bar2"/>
By the inference rule (group) in Section 6.2.2, “group pattern”, we have
<group> <ref name="bar1"/> <ref name="bar2"/> </group>
By the inference rule (element) in Section 6.2.7, “element and attribute pattern”, we have
"", "foo" ), cx0, { }, m ) =~ <ref name="foo"/>
Here cx3 is an arbitrary context.
Thus we can apply the inference rule (valid) in Section 6.3, “Validity” and obtain
The following constraints are all checked after the grammar has been transformed to the simple form described in Section 5, “Simple syntax”. The purpose of these restrictions is to catch user errors and to facilitate implementation.
In this section we describe restrictions on where elements are allowed in the schema based on the names of the ancestor elements. We use the concept of a prohibited path to describe these restrictions. A path is a sequence of NCNames separated by / or //.
An element matches a path x, where x is an NCName, if and only if the local name of the element is x.
An element matches a path x/p, where x is an NCName and p is a path, if and only if the local name of the element is x and the element has a child that matches p.
An element matches a path x//p, where x is an NCName and p is a path, if and only if the local name of the element is x and the element has a descendant that matches p.
For example, the element
<foo> <bar> <baz/> </bar> </foo>
matches the paths foo, foo/bar, foo//bar, foo//baz, foo/bar/baz, foo/bar//baz and foo//bar/baz, but not foo/baz or foobar.
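The path-matching definition translates directly into a recursive check. The sketch below is a minimal Python illustration using the standard library's xml.etree.ElementTree; the function names local_name and matches are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET


def local_name(elem: ET.Element) -> str:
    # ElementTree writes namespaced tags as "{uri}local"; keep the local part.
    return elem.tag.rsplit("}", 1)[-1]


def matches(elem: ET.Element, path: str) -> bool:
    """Return True if elem matches path, where path uses / and // separators."""
    # Split the path into its first NCName, the separator (if any), and the rest.
    head, sep, rest = re.match(r"([^/]+)(//|/)?(.*)$", path).groups()
    if local_name(elem) != head:
        return False
    if not sep:
        # The path is a single NCName, and the local name already agrees.
        return True
    if sep == "/":
        # x/p: some child must match p.
        candidates = list(elem)
    else:
        # x//p: some descendant must match p (iter() yields elem itself first).
        candidates = list(elem.iter())[1:]
    return any(matches(c, rest) for c in candidates)


doc = ET.fromstring("<foo> <bar> <baz/> </bar> </foo>")
matching = ["foo", "foo/bar", "foo//bar", "foo//baz",
            "foo/bar/baz", "foo/bar//baz", "foo//bar/baz"]
non_matching = ["foo/baz", "foobar"]
```

Applied to the example element, matches returns True for every path in matching and False for both paths in non_matching.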
A correct RELAX NG schema must be such that, after transformation to the simple form, it does not contain any element that matches a prohibited path.
The following paths are prohibited:
The following paths are prohibited:
The following paths are prohibited:
This implies that an except element with a data parent can contain only data, value and choice elements.
RELAX NG does not allow a pattern such as:
<element name="foo">
  <group>
    <data type="int"/>
    <element name="bar">
      <empty/>
    </element>
  </group>
</element>
Nor does it allow a pattern such as:
<element name="foo">
  <group>
    <data type="int"/>
    <text/>
  </group>
</element>
More generally, if the pattern for the content of an element or attribute contains
then the two patterns must be alternatives to each other.
This rule does not apply to patterns occurring within a list pattern.
To formalize this, we use the concept of a content-type. A pattern that is allowable as the content of an element has one of three content-types: empty, complex and simple. We use the following notation.
The empty content-type is groupable with anything. In addition, the complex content-type is groupable with the complex content-type. The following rules formalize this.
(group empty 1) |
|
(group empty 2) |
|
(group complex) |
|
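The groupable relation described above is small enough to state as a function. This Python sketch is illustrative only; the names ContentType and groupable are assumptions made for the example.

```python
from enum import Enum


class ContentType(Enum):
    EMPTY = "empty"
    COMPLEX = "complex"
    SIMPLE = "simple"


def groupable(ct1: ContentType, ct2: ContentType) -> bool:
    """Return True if content-type ct1 is groupable with content-type ct2."""
    # The empty content-type is groupable with anything, on either side.
    if ct1 is ContentType.EMPTY or ct2 is ContentType.EMPTY:
        return True
    # Otherwise only complex is groupable, and only with complex.
    return ct1 is ContentType.COMPLEX and ct2 is ContentType.COMPLEX
```

In particular, the simple content-type is groupable only with the empty content-type, which is why a data pattern cannot be grouped with an element or text pattern.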
Some patterns have a content-type. We use the following additional notation.
The following rules define when a pattern has a content-type and, if so, what it is.
(value) |
|
(data 1) |
|
(data 2) |
|
(list) |
|
(text) |
|
(ref) |
|
(empty) |
|
(attribute) |
|
(group) |
|
(interleave) |
|
(oneOrMore) |
|
(choice) |
|
The antecedent in the (data 2) rule above is in fact redundant because of the prohibited paths in Section 7.1.4, “except in data pattern”.
Now we can describe the restriction. We use the following notation.
All patterns occurring as the content of an element pattern must have a content-type.
(element) |
|
Duplicate attributes are not allowed. More precisely, for a pattern <group> p1 p2 </group> or <interleave> p1 p2 </interleave>, there must not be a name that belongs to both the name class of an attribute pattern occurring in p1 and the name class of an attribute pattern occurring in p2. A pattern p1 is defined to occur in a pattern p2 if
Attributes using infinite name classes must be repeated. More precisely, an attribute element that has an anyName or nsName descendant element must have a oneOrMore ancestor element.
This restriction is necessary for closure under negation.
For a pattern <interleave> p1 p2 </interleave>,
Section 7.3, “Restrictions on attributes” defines when one pattern is considered to occur in another pattern.
A conforming RELAX NG validator must be able to determine for any XML document whether it is a correct RELAX NG schema. A conforming RELAX NG validator must be able to determine for any XML document and for any correct RELAX NG schema whether the document is valid with respect to the schema.
However, the requirements in the preceding paragraph do not apply if the schema uses a datatype library that the validator does not support. A conforming RELAX NG validator is only required to support the built-in datatype library described in Section 6.2.9, “Built-in datatype library”. A validator that claims conformance to RELAX NG should document which datatype libraries it supports. The requirements in the preceding paragraph also do not apply if the schema includes externalRef or include elements and the validator is unable to retrieve the resource identified by the URI or is unable to construct an element from the retrieved resource. A validator that claims conformance to RELAX NG should document its capabilities for handling URI references.
<grammar datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes" ns="http://relaxng.org/ns/structure/1.0" xmlns="http://relaxng.org/ns/structure/1.0"> <start> <ref name="pattern"/> </start> <define name="pattern"> <choice> <element name="element"> <choice> <attribute name="name"> <data type="QName"/> </attribute> <ref name="open-name-class"/> </choice> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="attribute"> <ref name="common-atts"/> <choice> <attribute name="name"> <data type="QName"/> </attribute> <ref name="open-name-class"/> </choice> <interleave> <ref name="other"/> <optional> <ref name="pattern"/> </optional> </interleave> </element> <element name="group"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="interleave"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="choice"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="optional"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="zeroOrMore"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="oneOrMore"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="list"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="mixed"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> <element name="ref"> <attribute name="name"> <data type="NCName"/> </attribute> <ref name="common-atts"/> <ref name="other"/> </element> <element name="parentRef"> <attribute name="name"> <data type="NCName"/> </attribute> <ref name="common-atts"/> <ref name="other"/> </element> <element name="empty"> <ref name="common-atts"/> <ref name="other"/> </element> <element name="text"> <ref name="common-atts"/> <ref name="other"/> </element> <element name="value"> <optional> <attribute name="type"> <data type="NCName"/> </attribute> </optional> <ref name="common-atts"/> <text/> </element> 
<element name="data"> <attribute name="type"> <data type="NCName"/> </attribute> <ref name="common-atts"/> <interleave> <ref name="other"/> <group> <zeroOrMore> <element name="param"> <attribute name="name"> <data type="NCName"/> </attribute> <ref name="common-atts"/> <text/> </element> </zeroOrMore> <optional> <element name="except"> <ref name="common-atts"/> <ref name="open-patterns"/> </element> </optional> </group> </interleave> </element> <element name="notAllowed"> <ref name="common-atts"/> <ref name="other"/> </element> <element name="externalRef"> <attribute name="href"> <data type="anyURI"/> </attribute> <ref name="common-atts"/> <ref name="other"/> </element> <element name="grammar"> <ref name="common-atts"/> <ref name="grammar-content"/> </element> </choice> </define> <define name="grammar-content"> <interleave> <ref name="other"/> <zeroOrMore> <choice> <ref name="start-element"/> <ref name="define-element"/> <element name="div"> <ref name="common-atts"/> <ref name="grammar-content"/> </element> <element name="include"> <attribute name="href"> <data type="anyURI"/> </attribute> <ref name="common-atts"/> <ref name="include-content"/> </element> </choice> </zeroOrMore> </interleave> </define> <define name="include-content"> <interleave> <ref name="other"/> <zeroOrMore> <choice> <ref name="start-element"/> <ref name="define-element"/> <element name="div"> <ref name="common-atts"/> <ref name="include-content"/> </element> </choice> </zeroOrMore> </interleave> </define> <define name="start-element"> <element name="start"> <ref name="combine-att"/> <ref name="common-atts"/> <ref name="open-pattern"/> </element> </define> <define name="define-element"> <element name="define"> <attribute name="name"> <data type="NCName"/> </attribute> <ref name="combine-att"/> <ref name="common-atts"/> <ref name="open-patterns"/> </element> </define> <define name="combine-att"> <optional> <attribute name="combine"> <choice> <value>choice</value> <value>interleave</value> 
</choice> </attribute> </optional> </define> <define name="open-patterns"> <interleave> <ref name="other"/> <oneOrMore> <ref name="pattern"/> </oneOrMore> </interleave> </define> <define name="open-pattern"> <interleave> <ref name="other"/> <ref name="pattern"/> </interleave> </define> <define name="name-class"> <choice> <element name="name"> <ref name="common-atts"/> <data type="QName"/> </element> <element name="anyName"> <ref name="common-atts"/> <ref name="except-name-class"/> </element> <element name="nsName"> <ref name="common-atts"/> <ref name="except-name-class"/> </element> <element name="choice"> <ref name="common-atts"/> <ref name="open-name-classes"/> </element> </choice> </define> <define name="except-name-class"> <interleave> <ref name="other"/> <optional> <element name="except"> <ref name="open-name-classes"/> </element> </optional> </interleave> </define> <define name="open-name-classes"> <interleave> <ref name="other"/> <oneOrMore> <ref name="name-class"/> </oneOrMore> </interleave> </define> <define name="open-name-class"> <interleave> <ref name="other"/> <ref name="name-class"/> </interleave> </define> <define name="common-atts"> <optional> <attribute name="ns"/> </optional> <optional> <attribute name="datatypeLibrary"> <data type="anyURI"/> </attribute> </optional> <zeroOrMore> <attribute> <anyName> <except> <nsName/> <nsName ns=""/> </except> </anyName> </attribute> </zeroOrMore> </define> <define name="other"> <zeroOrMore> <element> <anyName> <except> <nsName/> </except> </anyName> <zeroOrMore> <choice> <attribute> <anyName/> </attribute> <text/> <ref name="any"/> </choice> </zeroOrMore> </element> </zeroOrMore> </define> <define name="any"> <element> <anyName/> <zeroOrMore> <choice> <attribute> <anyName/> </attribute> <text/> <ref name="any"/> </choice> </zeroOrMore> </element> </define> </grammar>
The changes in this version relative to version 0.9 are as follows:
0.9 has been changed to 1.0
data/except//empty has been added as a prohibited path (see Section 7.1.4, “except in data pattern”)
start//empty has been added as a prohibited path (see Section 7.1.5, “start element”)
list element with more than one child element is transformed
notAllowed element” now specifies how a notAllowed element occurring in an except element is transformed
ns and datatypeLibrary attributes, an empty string is allowed (see Section 3, “Full syntax”)
define and ref elements” is now correctly specified
notAllowed element” now specifies that define elements that are no longer reachable are removed
except in name classes that are now specified in the newly added section were previously specified in a subsection of Section 7.1, “Contextual restrictions”, which has been removed
element and attribute pattern” and Section 6.2.8, “data and value pattern”)
interleave (see Section 7.4, “Restrictions on interleave”); list//interleave has been added as a prohibited path (see Section 7.1.3, “list pattern”)
ref rather than element
text pattern” has been corrected
ns attribute is now unconstrained (see Section 3, “Full syntax”)
This specification was prepared and approved for publication by the RELAX NG TC. The current members of the TC are: