4. Validating XML Documents

4.1. The Validator Class

The Validator class encapsulates XMLUnit's validation support. It will use the SAXParser configured in XMLUnit (see Section 2.4.1, “JAXP”).

The piece of XML to validate is passed to the constructor. Constructors that take more than one argument are only relevant if you want to validate against a DTD and need to provide the location of the DTD itself; see the next section for details.

By default, Validator validates against a DTD, but it can also validate against one or more XML Schemas. Schema validation requires an XML parser that supports it.
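
To make concrete what DTD validation with a JAXP SAXParser involves, here is a minimal sketch that uses only the JDK. It is not XMLUnit's implementation: the class name DtdValidationSketch and its validate method are hypothetical, and the DTD is embedded in the DOCTYPE's internal subset so the example needs no files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of XMLUnit.
class DtdValidationSketch {

    /** Returns the errors reported while parsing; an empty list means valid. */
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // request DTD validation
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity violations are reported here
                }
            });
        } catch (Exception e) {
            // Fatal errors (for example, XML that is not well-formed) end up here.
            errors.add(e.toString());
        }
        return errors;
    }

    public static void main(String[] args) {
        String doctype = "<!DOCTYPE greeting [<!ELEMENT greeting (#PCDATA)>]>";
        System.out.println(validate(doctype + "<greeting>hi</greeting>"));
        System.out.println(validate(doctype + "<greeting><b/></greeting>"));
    }
}
```

The first document in main should produce an empty error list, while the second should produce at least one error because the element b is not declared in the DTD.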

4.1.1. DTD Validation

Validating against a DTD is straightforward if the piece of XML contains a DOCTYPE declaration with a SYSTEM identifier that can be resolved at validation time. Simply create a Validator object using one of the single-argument constructors.

Example 23. Validating Against the DTD Defined in DOCTYPE

InputSource is = new InputSource(new FileInputStream(myXmlDocument));
Validator v = new Validator(is);
boolean isValid = v.isValid();

If the piece of XML doesn't contain a DOCTYPE declaration at all, or it contains a DOCTYPE but you want to validate against a different DTD, use one of the three-argument constructors of Validator. In this case, the publicId argument becomes the PUBLIC identifier and systemId the SYSTEM identifier of the DOCTYPE that is implicitly added to the piece of XML. Any existing DOCTYPE will be removed. The systemId should be a URL that can be resolved by your parser.

Example 24. Validating a Piece of XML that doesn't Contain a DOCTYPE

InputSource is = new InputSource(new FileInputStream(myXmlDocument));
Validator v = new Validator(is,
                            (new File(myDTD)).toURI().toURL().toString(),
                            myPublicId);
boolean isValid = v.isValid();

If the piece of XML already has the correct DOCTYPE declaration but the declaration either doesn't specify a SYSTEM identifier at all or you want the SYSTEM identifier to resolve to a different location you have two options:

  • Use one of the two-argument constructors and specify the alternative URL as systemId.

    Example 25. Validating Against a Local DTD

    InputSource is = new InputSource(new FileInputStream(myXmlDocument));
    Validator v = new Validator(is,
                                (new File(myDTD)).toURI().toURL().toString());
    boolean isValid = v.isValid();
    

  • Use a custom EntityResolver via XMLUnit.setControlEntityResolver together with one of the single-argument constructors of Validator.

    For example, this approach allows you to use an OASIS catalog[8] together with the Apache XML Resolver library[9] to resolve the DTD location as well as the location of any other entity in your piece of XML.

    Example 26. Validating Against a DTD Using Apache's XML Resolver and an XML Catalog

    InputSource is = new InputSource(new FileInputStream(myXmlDocument));
    XMLUnit.setControlEntityResolver(new CatalogResolver());
    Validator v = new Validator(is);
    boolean isValid = v.isValid();
    
    #CatalogManager.properties
    
    verbosity=1
    relative-catalogs=yes
    catalogs=/some/path/to/catalog
    prefer=public
    static-catalog=yes
    catalog-class-name=org.apache.xml.resolver.Resolver
    
    <!-- catalog file -->
    
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <public publicId="-//Some//DTD V 1.1//EN"
              uri="mydtd.dtd"/>
    </catalog>
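
If a full catalog is more than you need, a hand-written EntityResolver can play the same role. The sketch below is only an illustration under assumptions: the class LocalDtdResolver, its in-memory DTD and the validate helper are hypothetical, and the helper drives a plain JAXP parser rather than Validator so that the example runs with the JDK alone.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical resolver: serves one DTD from memory, keyed by its PUBLIC identifier.
class LocalDtdResolver implements EntityResolver {
    static final String PUBLIC_ID = "-//Some//DTD V 1.1//EN";
    static final String DTD = "<!ELEMENT greeting (#PCDATA)>"; // stand-in for mydtd.dtd

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (PUBLIC_ID.equals(publicId)) {
            InputSource source = new InputSource(new StringReader(DTD));
            source.setSystemId(systemId); // keep the original identifier for error messages
            return source;
        }
        return null; // let the parser resolve everything else itself
    }

    /** Demo only: validates xml with a plain JAXP parser that uses this resolver. */
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(new LocalDtdResolver());
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            errors.add(e.toString());
        }
        return errors;
    }

    public static void main(String[] args) {
        // The SYSTEM identifier is a made-up URL; the resolver answers before it is fetched.
        String doctype = "<!DOCTYPE greeting PUBLIC \"" + PUBLIC_ID
                + "\" \"http://example.invalid/mydtd.dtd\">";
        System.out.println(validate(doctype + "<greeting>hi</greeting>"));
    }
}
```

With XMLUnit, an instance of such a resolver would be registered through XMLUnit.setControlEntityResolver, in place of the CatalogResolver used in Example 26.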
    

4.1.2. XML Schema Validation

To validate against an XML Schema, Schema validation has to be enabled via the useXMLSchema method of Validator.

By default, the parser will try to resolve the location of Schema definition files via a schemaLocation attribute if one is present in the piece of XML, or it will try to open the Schema's URI as a URL and read from it.

The setJAXP12SchemaSource method of Validator allows you to override this behavior as long as the parser supports the http://java.sun.com/xml/jaxp/properties/schemaSource property in the way described in "JAXP 1.2 Approved CHANGES"[10].

setJAXP12SchemaSource's argument can be one of the following:

  • A String which contains a URI.
  • An InputStream the Schema can be read from.
  • An InputSource the Schema can be read from.
  • A File the Schema can be read from.
  • An array containing any of the above.

If the property has been set using a String, the Validator class will provide its systemId as specified in the constructor when asked to resolve it. Use only the single-argument constructors if you want to avoid this behavior. If no systemId has been specified, the configured EntityResolver may still be used.
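
For readers who want to see the schemaSource property itself at work, the following sketch sets it directly on a JAXP SAXParser, using an InputStream as the Schema source. It is an illustration under assumptions rather than XMLUnit's code: the class SchemaSourceSketch, its validate method and the tiny in-memory Schema are hypothetical. The companion schemaLanguage property has to be set before schemaSource.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of XMLUnit.
class SchemaSourceSketch {
    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Made-up Schema: a single element named count that must hold an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='count' type='xs:int'/>"
        + "</xs:schema>";

    /** Returns the errors reported while parsing; an empty list means valid. */
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // Schema validation needs namespace support
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            parser.setProperty(SCHEMA_LANGUAGE, W3C_XML_SCHEMA); // must come first
            parser.setProperty(SCHEMA_SOURCE,
                new ByteArrayInputStream(XSD.getBytes(StandardCharsets.UTF_8)));
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
        } catch (Exception e) {
            errors.add(e.toString());
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<count>42</count>"));
        System.out.println(validate("<count>many</count>"));
    }
}
```

The first call should report no errors and the second should report a datatype error. A stream can be read only once, so the sketch creates a fresh stream and parser for every call.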

Example 27. Validating Against a Local XML Schema

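InputSource is = new InputSource(new FileInputStream(myXmlDocument));
Validator v = new Validator(is);
// Switch from the default DTD validation to XML Schema validation.
v.useXMLSchema(true);
// Read the Schema from a local file instead of letting the parser locate it.
v.setJAXP12SchemaSource(new File(myXmlSchemaFile));
boolean isValid = v.isValid();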

4.2. JUnit 3.x Convenience Methods

Both XMLAssert and XMLTestCase provide an assertXMLValid(Validator) method that will fail if Validator's isValid method returns false.

In addition, several overloads of the assertXMLValid method are provided that directly correspond to similar overloads of Validator's constructor. These overloads don't support XML Schema validation at all.

Validator itself provides an assertIsValid method that will throw an AssertionFailedError if validation fails.

Neither method provides any control over the message of the AssertionFailedError in case of a failure.

4.3. Configuration Options