Importing and Processing data from XML files into SQL Server tables

[Origin]: https://www.mssqltips.com/sqlservertip/2899/importing-and-processing-data-from-xml-files-into-sql-server-tables/

Problem

In my last article, I talked about how you can use an FTP task in SSIS to download files from an FTP server. But what if the file you have downloaded is an XML file and you need to import its data into a SQL Server table? How do you process/parse XML data into SQL Server tables?

Solution

There are different ways to import data from an XML file into a SQL Server table, but I am going to demonstrate one of the easiest.

These are the steps I performed to import the data into SQL Server and then parse the XML into a relational format.

  • Import XML data from an XML file into a SQL Server table using the OPENROWSET function
  • Parse the XML data using the OPENXML function

Importing XML data from XML file using OPENROWSET

I have an XML file downloaded from my FTP location to a local folder. It contains customers, each with their orders, order details and an address; the full document is shown later in this section.

To import data from the XML file into a table in SQL Server, I am using the OPENROWSET function.

In the script below, I first create a table with a column of data type XML and then read the XML data from the file using OPENROWSET, specifying the location and name of the XML file:

CREATE DATABASE OPENXMLTesting
GO

USE OPENXMLTesting
GO

-- Staging table: one row per imported file, with the whole document in an XML column
CREATE TABLE XMLwithOpenXML
(
Id INT IDENTITY PRIMARY KEY,
XMLData XML,
LoadedDateTime DATETIME
)

-- Read the entire file as a single value (SINGLE_BLOB) and convert it to XML
INSERT INTO XMLwithOpenXML(XMLData, LoadedDateTime)
SELECT CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE()
FROM OPENROWSET(BULK 'D:\OpenXMLTesting.xml', SINGLE_BLOB) AS x;

SELECT * FROM XMLwithOpenXML

When I query the table into which I imported the XML data, the XMLData column, being of the XML data type, is displayed in SSMS as a hyperlink.

Clicking on the hyperlink opens another tab within SSMS with the XML data displayed as shown below.

<ROOT>
  <Customers>
    <Customer CustomerID="C001" CustomerName="Arshad Ali">
      <Orders>
        <Order OrderID="10248" OrderDate="2012-07-04T00:00:00">
          <OrderDetail ProductID="10" Quantity="5" />
          <OrderDetail ProductID="11" Quantity="12" />
          <OrderDetail ProductID="42" Quantity="10" />
        </Order>
      </Orders>
      <Address> Address line 1, 2, 3</Address>
    </Customer>
    <Customer CustomerID="C002" CustomerName="Paul Henriot">
      <Orders>
        <Order OrderID="10245" OrderDate="2011-07-04T00:00:00">
          <OrderDetail ProductID="11" Quantity="12" />
          <OrderDetail ProductID="42" Quantity="10" />
        </Order>
      </Orders>
      <Address> Address line 5, 6, 7</Address>
    </Customer>
    <Customer CustomerID="C003" CustomerName="Carlos Gonzlez">
      <Orders>
        <Order OrderID="10283" OrderDate="2012-08-16T00:00:00">
          <OrderDetail ProductID="72" Quantity="3" />
        </Order>
      </Orders>
      <Address> Address line 1, 4, 5</Address>
    </Customer>
  </Customers>
</ROOT>
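
The import script earlier in this section hard-codes the file path. OPENROWSET(BULK ...) expects the path as a string literal rather than a variable, so when the file name changes from run to run, one option is to build the statement dynamically. The following is only a sketch under that assumption: the @FilePath and @Sql variables are hypothetical names introduced for illustration, and the XMLwithOpenXML table from the script above is assumed to exist in the current database.

```sql
-- Hypothetical variable holding the file to load; adjust to your own location.
DECLARE @FilePath NVARCHAR(260) = N'D:\OpenXMLTesting.xml';

-- Build the INSERT as text, doubling any single quotes in the path
-- so that it stays a valid string literal inside the dynamic statement.
DECLARE @Sql NVARCHAR(MAX) =
      N'INSERT INTO XMLwithOpenXML (XMLData, LoadedDateTime) '
    + N'SELECT CONVERT(XML, BulkColumn), GETDATE() '
    + N'FROM OPENROWSET(BULK ''' + REPLACE(@FilePath, N'''', N'''''')
    + N''', SINGLE_BLOB) AS x;';

EXEC sys.sp_executesql @Sql;
```

The path is resolved on the machine running SQL Server, so the file has to be readable from there.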

Process XML data using OPENXML function

XML data stored in a column of data type XML can be processed either by using the XML methods available in SQL Server (such as nodes() and value()) or by using the sp_xml_preparedocument stored procedure along with the OPENXML function.

We first call the sp_xml_preparedocument stored procedure, passing it the XML data. It parses the document, stores it in an internal cache and outputs a handle to it.

Then we use the handle returned by the sp_xml_preparedocument stored procedure in the OPENXML function to open the XML data and read it.

Note: because the sp_xml_preparedocument stored procedure stores the XML data in SQL Server’s internal cache, it is essential to release it again by calling the sp_xml_removedocument stored procedure. We should call sp_xml_removedocument as early as possible, so that the internal cache can be freed for other usage.
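
One way to make sure the handle is released even when the OPENXML query fails is to wrap the query in TRY...CATCH. This is a minimal sketch, not part of the original scripts: it uses a small inline XML value so it can run on its own, and THROW assumes SQL Server 2012 or later.

```sql
DECLARE @XML XML = N'<ROOT><Customers>
  <Customer CustomerID="C001" CustomerName="Arshad Ali" />
</Customers></ROOT>';
DECLARE @hDoc INT;

-- Parse the document and obtain a handle to the cached copy.
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML;

BEGIN TRY
    SELECT CustomerID, CustomerName
    FROM OPENXML(@hDoc, '/ROOT/Customers/Customer')
    WITH
    (
        CustomerID   VARCHAR(50)  '@CustomerID',
        CustomerName VARCHAR(100) '@CustomerName'
    );
END TRY
BEGIN CATCH
    -- Free the cached document before re-raising the original error.
    EXEC sp_xml_removedocument @hDoc;
    THROW;
END CATCH;

-- Normal path: release the handle as soon as the query has finished.
EXEC sp_xml_removedocument @hDoc;
```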

USE OPENXMLTesting
GO

DECLARE @XML AS XML, @hDoc AS INT

-- The table holds a single row here; with several rows, filter to the one you want
SELECT @XML = XMLData FROM XMLwithOpenXML

-- Parse the XML and get a handle to the cached document
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

-- One row per Customer element
SELECT CustomerID, CustomerName, Address
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer')
WITH 
(
CustomerID [varchar](50) '@CustomerID',
CustomerName [varchar](100) '@CustomerName',
Address [varchar](100) 'Address'
)

-- Release the cached document
EXEC sp_xml_removedocument @hDoc
GO

From the above XML data, I want to retrieve all the customer information, so in the above SELECT statement I navigate to the Customer element with the OPENXML function and query its CustomerID and CustomerName attributes (please note the “@” before the name of an attribute) and its Address child element.

The structure of the resultset, that is, the column names, their data types and where each value comes from, is defined by the “WITH” clause as shown above.
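
The column patterns in the WITH clause are optional. As a sketch of the alternative, OPENXML accepts a third flags argument: 1 maps columns to attributes by name, 2 maps them to child elements by name, and 3 combines both. The inline XML below is a trimmed copy of the sample data so that the example can run on its own.

```sql
DECLARE @XML XML = N'<ROOT><Customers>
  <Customer CustomerID="C001" CustomerName="Arshad Ali">
    <Address>Address line 1, 2, 3</Address>
  </Customer>
</Customers></ROOT>';
DECLARE @hDoc INT;

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML;

-- Flags = 3: columns are matched by name, first against attributes
-- (CustomerID, CustomerName) and then against child elements (Address).
SELECT CustomerID, CustomerName, Address
FROM OPENXML(@hDoc, '/ROOT/Customers/Customer', 3)
WITH
(
    CustomerID   VARCHAR(50),
    CustomerName VARCHAR(100),
    Address      VARCHAR(100)
);

EXEC sp_xml_removedocument @hDoc;
```

Explicit patterns such as '@CustomerID' are still needed whenever a column name differs from the attribute or element name, or when the value lives on a parent node.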

From the above XML data, I now want to retrieve all the customer information along with the OrderID and OrderDate of each order placed by each customer, so I navigate to the Order element and query its OrderID and OrderDate attributes.

To navigate back up and get data from the parent or grandparent level, use “../” to read the parent’s data, “../../” to read the grandparent’s data, and so on. Here Order is a child of Orders, which is a child of Customer, so the customer columns use “../../”.

USE OPENXMLTesting
GO

DECLARE @XML AS XML, @hDoc AS INT

SELECT @XML = XMLData FROM XMLwithOpenXML

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

-- One row per Order element; customer columns are read two levels up
SELECT CustomerID, CustomerName, Address, OrderID, OrderDate
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer/Orders/Order')
WITH 
(
CustomerID [varchar](50) '../../@CustomerID',
CustomerName [varchar](100) '../../@CustomerName',
Address [varchar](100) '../../Address',
OrderID [varchar](1000) '@OrderID',
OrderDate datetime '@OrderDate'
)

EXEC sp_xml_removedocument @hDoc
GO

The above query returns one row per order, showing each customer together with every order that customer placed.

Now let’s go one level deeper. This time I want to retrieve all the customer information and their orders along with the ProductID and Quantity from each order placed. As you can see below, I navigate to the OrderDetail element and retrieve its ProductID and Quantity attributes. I use “../” to reach the parent level for the Order information and “../../../” to reach the great-grandparent level for the Customer information:

USE OPENXMLTesting
GO

DECLARE @XML AS XML, @hDoc AS INT

SELECT @XML = XMLData FROM XMLwithOpenXML

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

-- One row per OrderDetail element; order columns are one level up, customer columns three
SELECT CustomerID, CustomerName, Address, OrderID, OrderDate, ProductID, Quantity
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer/Orders/Order/OrderDetail')
WITH 
(
CustomerID [varchar](50) '../../../@CustomerID',
CustomerName [varchar](100) '../../../@CustomerName',
Address [varchar](100) '../../../Address',
OrderID [varchar](1000) '../@OrderID',
OrderDate datetime '../@OrderDate',
ProductID [varchar](50) '@ProductID',
Quantity int '@Quantity'
)

EXEC sp_xml_removedocument @hDoc
GO

The above query returns one row per OrderDetail element, showing all the customer information and their orders along with the ProductID and Quantity from each order placed.
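
For comparison, the same flattening can be sketched with the nodes() and value() methods of the XML data type, which need no document handle and therefore no cleanup call. This is an alternative to the OPENXML approach used in this tip rather than a replacement for it. The aliases Cust(c), Ord(o) and Det(d) are names chosen for this example, and the inline XML is a trimmed copy of the sample data.

```sql
DECLARE @XML XML = N'<ROOT><Customers>
  <Customer CustomerID="C001" CustomerName="Arshad Ali">
    <Orders>
      <Order OrderID="10248" OrderDate="2012-07-04T00:00:00">
        <OrderDetail ProductID="10" Quantity="5" />
        <OrderDetail ProductID="11" Quantity="12" />
      </Order>
    </Orders>
    <Address>Address line 1, 2, 3</Address>
  </Customer>
</Customers></ROOT>';

-- Each nodes() call shreds one level; CROSS APPLY walks down from
-- Customer to Order to OrderDetail, so no "../" navigation is needed.
SELECT
    c.value('@CustomerID',         'VARCHAR(50)')   AS CustomerID,
    c.value('@CustomerName',       'VARCHAR(100)')  AS CustomerName,
    c.value('(Address/text())[1]', 'VARCHAR(100)')  AS Address,
    o.value('@OrderID',            'VARCHAR(1000)') AS OrderID,
    o.value('@OrderDate',          'DATETIME')      AS OrderDate,
    d.value('@ProductID',          'VARCHAR(50)')   AS ProductID,
    d.value('@Quantity',           'INT')           AS Quantity
FROM @XML.nodes('/ROOT/Customers/Customer') AS Cust(c)
CROSS APPLY c.nodes('Orders/Order') AS Ord(o)
CROSS APPLY o.nodes('OrderDetail')  AS Det(d);
```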
