Quick and easy XML validation using PowerShell

By Glib Briia on March 29, 2016

A brief overview of how to validate XML against an XSD in PowerShell using the PSCX module.

Since XML files are widely used, it is necessary to make sure they are well-formed and contain valid data before committing or deploying the code elsewhere. To accomplish this, the document needs to be validated against its XML schema or DTD, or at the very least checked for syntax mistakes.

Let's say there is a sample XML file (keeping it simple):

<?xml version="1.0" encoding="UTF-8"?>
<Users>
    <User>
       <FirstName>Alex</FirstName>
       <LastName>Smith</LastName>
       <DOB>12-01-1980</DOB>
    </User>
    <User>
       <FirstName>Yevhen</FirstName>
       <LastName>Baker</LastName>
       <DOB>01-04-1991</DOB>
    </User>
</Users>

A corresponding XSD also needs to be created. There are numerous ways to generate it automatically online (e.g. http://www.freeformatter.com/xsd-generator.html). I prefer to write it manually, since these tools provide only basic generation capabilities and cannot handle patterns, required/optional nodes and attributes, or custom types, so manual editing is needed anyway.

The XSD for the XML above looks as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Users">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="User" maxOccurs="unbounded" minOccurs="1">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element type="xs:string" name="FirstName"/>
                            <xs:element type="xs:string" name="LastName"/>
                            <xs:element name="DOB">
                                <xs:simpleType>
                                    <xs:restriction base="xs:string">
                                        <xs:pattern value="[0-9]{2}-[0-9]{2}-[0-9]{4}"/>
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

PowerShell itself has no built-in cmdlet for proper XML validation. You can, of course, try to read the file (see How to read XML file in Powershell) and catch any exceptions raised if the document can’t be converted to an XML object, but that won’t solve the problem of validating against an XSD.
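As a rough illustration of that read-and-catch approach, here is a minimal sketch that only checks well-formedness. It uses the .NET XmlDocument class directly; the function name Test-XmlWellFormed is a hypothetical helper, not something that ships with PowerShell or PSCX.

```powershell
# Minimal sketch: returns $true if the file loads as XML, $false otherwise.
# Test-XmlWellFormed is a hypothetical helper name used for illustration only.
function Test-XmlWellFormed([string]$Path)
{
    try
    {
        # Resolve to a full file system path; a missing file throws because of -ErrorAction Stop.
        $fullPath = (Resolve-Path -LiteralPath $Path -ErrorAction Stop).ProviderPath

        # XmlDocument.Load throws if the markup is not well-formed.
        $doc = New-Object System.Xml.XmlDocument
        $doc.Load($fullPath)
        return $true
    }
    catch
    {
        Write-Warning "Could not load '$Path' as XML: $($_.Exception.Message)"
        return $false
    }
}
```

Calling Test-XmlWellFormed "users.xml" tells you whether the file parses, but it says nothing about whether the content follows the schema.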

So, for proper validation we’ll use the PSCX module.

PSCX requires PowerShell version 3 or higher.

To start using it, download Pscx-3.2.0.msi and follow the installer's instructions.

After the installation completes, you should be able to run Import-Module PSCX in a PowerShell terminal without any errors.
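If you want to verify the installation from a script rather than by hand, something along these lines should work. It is only a sketch: the variable names are made up for illustration, and it assumes the module is registered under the name Pscx.

```powershell
# Minimal sketch: import PSCX only if it is discoverable, then confirm Test-Xml is exposed.
# $pscx and $testXmlAvailable are illustrative variable names.
$pscx = Get-Module -ListAvailable -Name Pscx
if ($pscx)
{
    Import-Module Pscx
    $testXmlAvailable = [bool](Get-Command Test-Xml -Module Pscx -ErrorAction SilentlyContinue)
}
else
{
    $testXmlAvailable = $false
    Write-Warning "PSCX was not found on this machine; check that the installer completed."
}
Write-Host "Test-Xml available: $testXmlAvailable"
```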

To perform basic validation, checking only whether the XML is well-formed, it is enough to call the Test-Xml function from the PSCX module:

Test-Xml path_to_xml_file -Verbose

Wrapping this in a function:

# Returns $true if the file is well-formed XML; -Verbose prints details of any problem.
function Validate-XML-Wellformed($xmlfile)
{
    return Test-Xml $xmlfile -Verbose
}

$result = Validate-XML-Wellformed "users.xml"

Write-Host "Is valid: " $result

Called with the valid XML above, the function returns True.

So far so good. Let's update the XML in order to get a validation failure:

...
 <User>
       <FirstName>Alex</First>
       <LastName>Smith</LastName>
       <DOB>12-01-1980</DOB>
 </User>
...

Now let's add a function for validating against the XSD, so the full code looks as follows:

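# Checks only that the file is well-formed XML.
function Validate-XML-Wellformed($xmlfile)
{
    return Test-Xml $xmlfile -Verbose
}

# Checks that the file is well-formed and conforms to the given XSD.
function Validate-XML-AgainstXSD($xmlfile, $xsd)
{
    return Test-Xml $xmlfile -Validate -SchemaPath $xsd -Verbose
}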

$result = Validate-XML-AgainstXSD "users.xml" "users.xsd"

Write-Host "Is valid: " $result

Now let's change the first user's DOB to an incorrect format:

...
<DOB>12-1-1980</DOB>
...

This time the validation fails, and the verbose output explains why: the DOB is expected to match the pattern [0-9]{2}-[0-9]{2}-[0-9]{4}.
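To see why the modified value is rejected, the same pattern can be tried directly in PowerShell. This is just a sketch for illustration: an XSD pattern has to match the whole value, so the regex below is anchored with ^ and $ to mimic that behaviour, and the variable names are made up.

```powershell
# The XSD pattern from the schema, anchored so it must match the entire value.
$pattern = '^[0-9]{2}-[0-9]{2}-[0-9]{4}$'

$originalDob = '12-01-1980' -match $pattern   # True: two digits, two digits, four digits
$modifiedDob = '12-1-1980'  -match $pattern   # False: the month has only one digit

Write-Host "12-01-1980 matches: $originalDob"
Write-Host "12-1-1980 matches: $modifiedDob"
```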

In this simple example the XML file can easily be verified by a trained eye, without any additional tools. In real life, though, XML files are usually not this simple and quickly checking their validity by hand is practically impossible, so this module comes in very handy.

A good example of usage is plugging it into your CI to recursively search for and validate all or selected XML files at the compile stage, so that issues are caught at the very beginning.
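A CI step of that kind could be sketched roughly as follows. The helper name Get-InvalidXmlFile, its parameters and the paths in the usage comment are all hypothetical. The validator is passed in as a script block so the same helper can wrap either the well-formedness check or the XSD check.

```powershell
# Sketch of a CI helper: returns every *.xml file under $Root that the validator rejects.
# Get-InvalidXmlFile and its parameters are hypothetical names, not part of PSCX.
function Get-InvalidXmlFile([string]$Root, [scriptblock]$Validator)
{
    Get-ChildItem -Path $Root -Filter *.xml -Recurse -File |
        Where-Object { -not (& $Validator $_.FullName) }
}

# Example CI usage (assumes PSCX is imported; the paths are placeholders):
#
#   $invalid = @(Get-InvalidXmlFile ".\src" {
#       param($file) Test-Xml $file -Validate -SchemaPath ".\users.xsd"
#   })
#   $invalid | ForEach-Object { Write-Host "Invalid XML: $($_.FullName)" }
#   if ($invalid.Count -gt 0) { exit 1 }   # non-zero exit code fails the build step
```

Returning the list of failing files, rather than a single True/False, makes it easy for the build log to show exactly which documents need attention.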