SQLUNINTERRUPTED

I am just a medium, SQL Server the Goal

Monthly Archives: February 2012

NESTED FOR XML Queries with SQL Server

Continuing with the series on XML, today we will talk about NESTED XML queries in SQL Server. The ability to write NESTED queries with FOR XML was introduced in SQL 2005 and was largely possible because of the introduction of XML data type in SQL. The results from a nested query could be passed on as an XML to the outer query, where it can be further processed using one of the XQuery methods.

For a complete list of all related articles please refer here.

A Simple example of a nested FOR XML query can be

DECLARE @x xml, @i datetime
SET @x=(SELECT EmployeeID, Title, HireDate
        FROM HumanResources.Employee
        WHERE EmployeeID < 5
        FOR XML RAW, TYPE)
SELECT @x
select @i=@x.value(‘/row[1]/@HireDate[1]’, ‘datetime’)
select @i

<<<< Output >>>>
<row EmployeeID="1" Title="Production Technician – WC60" HireDate="1996-07-31T00:00:00" />
<row EmployeeID="2" Title="Marketing Assistant" HireDate="1997-02-26T00:00:00" />

1996-07-31 00:00:00.000

<<<< Output >>>>

The TYPE directive is of importance here. Without the type directive the results would be returned in a nvarchar(max) format. In the above query if we do not Specify the TYPE directive, but declare @x as XML, SQL will try to do the automatic conversion and allow you to process the result as an XML type. But if we declare the variable as nvarchar(max) and then try to use the XQuery it would fail.

Now lets take a more complex example of the Nested FOR XML queries.

SELECT EmployeeID, HireDate,
       (select Contact.FirstName, Contact.LastName
            from Person.Contact Contact
            where Contact.ContactID = Employee.ContactID
        FOR XML AUTO, TYPE
       ) as Name
FROM HumanResources.Employee Employee
ORDER BY EmployeeID

—- Output —————–

EmployeeID HireDate Name
1

1996-07-31 00:00:00

<Contact FirstName="Guy" LastName="Gilbert" />
2

1997-02-26 00:00:00

<Contact FirstName="Kevin" LastName="Brown" />
3

1997-12-12 00:00:00

<Contact FirstName="Roberto" LastName="Tamburello" />
4 1998-01-05 00:00:00 <Contact FirstName="Rob" LastName="Walters" />

In the above query, we are returning the employee names as an XML. If we specify a FOR XML in the outer query also, we will get an XML style output.

As a final Example, lets write a query which will use the inner query and perform some XML type operations on it.

SELECT EmployeeID,
( SELECT Contact.FirstName, Contact.LastName
   FROM Person.Contact Contact
   –where Contact.ContactID = Employee.ContactID
   FOR XML AUTO, type ).query
                (‘for $p in /Contact
                    return
                 <Name>{string($p/@FirstName)} {string(" ")}
                        {string($p/@LastName)} {string(" ")}></Name>’)
FROM HumanResources.Employee Employee
ORDER BY EmployeeID
FOR XML AUTO, TYPE

Here we are using the inner query as an XML and doing XPath() query on this. We are using a special construct of XPath() queries called the FWOR statement to query the XML returned by the inner query. Below is the output.

EmployeeID Name
1 <Name>Guy Gilbert </Name>
2 <Name>Kevin Brown </Name>
3 <Name>Roberto Tamburello </Name>
4 <Name>Rob Walters </Name>
5 <Name>Thierry D’Hers </Name>
6 <Name>David Bradley </Name>

 

Hope this helps.

Advertisements

Working with FOR XML PATH mode in SQL 2005

PATH mode was added in SQL 2005, to allow for easier query writing experience. Using the FOR XML EXPLICIT mode, we could have got the desired results, but writing query was really cumbersome. Especially if there is multiple level of nesting.

For a complete list of related articles refer here.

FOR PATH provides a very simple way for mixing elements and attributes. It also provides a simpler way of nesting elements in the XML. In PATH mode, column names are used to represent whether the values will form an Element, an attribute. If a column name or Column alias is not specified all values will be represented inline. For example,

SELECT ‘I dont have a Column Name’ FOR XML PATH
<<<<<<Output>>>>>>             <row>I dont have a Column Name</row>         <<<<<<Output>>>>>> 

SELECT ‘I dont have a Column Name’, ‘ Another Column’ FOR XML PATH
<<<<<<Output>>>>>>            <row>I dont have a Column Name Another Column</row>     <<<<<<Output>>>>>> 

Since we are not providing column names (which in turn determines the nesting, SQL tries to add everything as one attribute.

Specifying Column Names

Column names in FOR PATH mode, have special meaning. Care should be while specifying the column names. As mentioned below, different symbols or formats have different meaning in the PATH mode.

Column names starting with @ : Path mode will treat it as an attribute value. If there a path expression prior to the @ColName, then this column name will become an attribute of the Element specified by the path expression. Else it will become an attribute of the root element. For Example…

/Employee/@EmpId : This will make EmpId an attribute of Employee Element.
@EmpID: This will create EmpID as an Attribute of the root element.

Column names Not starting with @ : Path mode will treat it as an Element. If there a path expression prior to the ColName, then this column name will become an SubElement of the Element specified by the path expression. Else it will become a SubElement of the root element. For Example…

/Employee/Name: This will make Name as SubElement of Employee Element.
Name: This will create Name as a SubElement of the root element.

Columns with Same path prefix: Columns having the same path prefix are grouped together and created as SubElement of the Element specified. If for some reason a different column with a different prefix is specified in between a sequence, a new element will be created when the path prefix is repeated. For example,

SELECT 
       FirstName "EmpName/First",
       AddressLine1 "Address/AddrLine1",
       AddressLine2 "Address/AddrLIne2",
       City "Address/City",
       MiddleName "EmpName/Middle",
       LastName "EmpName/Last"
FROM   HumanResources.EmployeeAddress E, Person.Contact C, Person.Address A
WHERE  E.EmployeeID = C.ContactID
AND    E.AddressID = A.AddressID
AND    E.EmployeeID=1
FOR XML PATH

In the above query we have specified FirstName, MiddelName and LastName as having the path prefix "/EmpName/…", which causes SQL to try and club these together and create as SubElements of the EmpName element. But in the query above, we have specified "/Address/.." immediately after the FirstName column, which restricts SQL from grouping the "/EmpName" elements together. The output in this case will be

<row>
  <EmpName>
    <First>Gustavo</First>
  </EmpName>
  <Address>
    <AddrLine1>7726 Driftwood Drive</AddrLine1>
    <City>Monroe</City>
  </Address>
  <EmpName>
    <Last>Achong</Last>
  </EmpName>
</row>

Column Names as Wildcard characters: If wildcards are used in place of column names, SQL will treat it like Columns with no names.

Column Names with Path Specified as Data: If we have a nested query returning multiple value, or if we want to specify or create a list/union like element, we can use the data() path expressions. This will tell SQL Server that the element generated are to be treated as atomic values. If there are multiple values, then a space separated list of values is created. For Example,

create table test_xml_path(a int)
go
insert into test_xml_path values (10)
go 7 

SELECT 1       as "@First",
       2                 as "@Second",
       (select a as "data()" from test_xml_path FOR XML PATH (”)) as "@Third"
FOR XML PATH

— Output ———-
<row First="1" Second="2" Third="10 10 10 10 10 10 10" />

Now that we have listed all the conventions, lets try to run a query which will use some of the naming conventions and the directives.

SELECT
       EmployeeID "@EmpID",               
       FirstName "EmpName/First",
       MiddleName "EmpName/Middle",
       LastName "EmpName/Last",
       AddressLine1 "Address/AddrLine1",
       AddressLine2 "Address/AddrLIne2",
       City "Address/City"
FROM   HumanResources.EmployeeAddress E, Person.Contact C, Person.Address A
WHERE  E.EmployeeID = C.ContactID
AND    E.AddressID = A.AddressID
AND    E.EmployeeID=1
FOR XML PATH (‘Employee’), ELEMENTS XSINIL
go

— "@EmpID" Forms an attribute for root node.
— "EmpName/First" First forms a child element of the EmpName element. Similarly for "EmpName/Middle" and "EmpName/Last"
— Similarly AddrLine1,AddrLine2,City forms a SubElements for Address.
— The ELEMENTS and XSINIL directives have the same usage as with AUTO mode, EXPLICIT mode or RAW mode.

<<<<<<<<<<<<<<<<<<<<<<< Output >>>>>>>>>>>>

<Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance&quot; EmpID="1">
  <EmpName>
    <First>Gustavo</First>
    <Middle xsi:nil="true" />
    <Last>Achong</Last>
  </EmpName>
  <Address>
    <AddrLine1>7726 Driftwood Drive</AddrLine1>
    <AddrLIne2 xsi:nil="true" />
    <City>Monroe</City>
  </Address>
</Employee>

<<<<<<<<<<<<<<<<<<<<<<< Output >>>>>>>>>>>>

In the next post, I will discussing Nested FOR XML queries.

Converting XML documents to Relational Tables using OPENXML

In earlier posts, I discussed how to generate XML output from relational tables using the FOR XML Queries. Today we will discuss how to generate relational table like output from an XML document using the OPENXML query format.

For a complete list of related topics, please refer here.

To generate relational like output, we need to first load the XML document and then parse it. Also we need to make sure that once we have got out desired results, we remove or unload the XML document.Thankfully SQL Server provides for 2 Stored Procedures which will do the work for us.

sp_xml_preparedocument: Reads the XML input and parses it using MSXML parser. After parsing, it creates an internal tree presentation of the nodes in the XML input. It returns a unique handle to the XML document, which can then be used by OPENXML to generate a table output and sp_xml_remove_document to remove the document.

sp_xml_removedocument: Removes the internal representation of the XML document. MSXML parsers can use a lot of SQL Server Address Space, and if the documents are removed post processing, it might lead to memory issues on SQL Server.

Using OPENXML

DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc ='<ROOT>
<Customer CustomerID="10" ContactName="Paul Henriot">
       <Order CustomerID="10" EmployeeID="5" OrderDate="1996-07-04T00:00:00">
          <OrderDetail OrderID="10248" ProductID="11" Quantity="12"/>
          <OrderDetail OrderID="10248" ProductID="42" Quantity="10"/>
       </Order>
</Customer>
<Customer CustomerID="20" ContactName="Carlos Gonzlez">
       <Order CustomerID="20" EmployeeID="3" OrderDate="1996-08-16T00:00:00">
              <OrderDetail OrderID="10283" ProductID="72" Quantity="3"/>
       </Order>
</Customer>
</ROOT>’
–Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

— Execute a SELECT statement that uses the OPENXML rowset provider.
SELECT    *
FROM       OPENXML (@idoc, ‘/ROOT/Customer’,1)
            WITH (CustomerID  varchar(10),
                  ContactName varchar(20))

— Remove the document
EXEC sp_xml_removedocument @idoc

Before we get to the output, lets spend a few minutes on the query itself.

The first part of the query, takes an AML document and parses it using the PrepareDocument command. It returns an handle to this parsed document, which is assigned to the @idoc variable.

Later we are using the OPENXML Command to convert this parsed document into a tabular representation. In the end we use the RemoveDocument to remove the internal representation to the XML.

<<<<<<<<<<<<<<<<<< Snippet Output >>>>>>>>>>>>>

CustomerID CustomerName
10 Paul Henriot
20 Carlos Gonzlez

<<<<<<<<<<<<<<<<<< Snippet Output >>>>>>>>>>>>>>

The OPENXML is basically a rowset provider. It reads the parsed XML and then provides a rowset view of the XML.

The General syntax of the OPENXML is

OPENXML(
idoc int [in],
rowpattern nvarchar[in],
[flags byte[in]])
[WITH (SchemaDeclaration | TableName)]
)

The idoc int, variable is used to provide the Handle for the parsed XML document prepared using the sp_xml_prepared_document.

RowPattern is XPATH for the nodes in the XML document, which will form a part of the rowset output. In the above example, I have specified “/ROOT/Customer”, which means that create a rowset output for the Customer element in the XML document. If the changed the OPENXML rowpattern to as mentioned below, we will get all the ORDER elements as rowset.

SELECT    *
FROM       OPENXML (@idoc, ‘/ROOT/Customer/Order’,1)
            WITH (CustomerID  varchar(10),
          EmployeeID int,
                  OrderDate DATETIME)

FLAGS: The flags parameter is used to indicate the mapping that should be used between the XML data and the relational rowset,  and how the spill-over column should be filled. flags is an optional input parameter, and can be one of these values.  
0 — Defaults to attribute-centric mapping.
1 — Use the attribute-centric mapping. Can be combined with XML_ELEMENTS; in which case, attribute-centric mapping is applied first, and then element-centric mapping is applied for all columns not yet dealt with.
2 — Use the element-centric mapping. Can be combined with XML_ATTRIBUTES; in which case, attribute-centric mapping is applied first, and then element-centric mapping is applied for all columns not yet dealt with.
8 — Can be combined (logical OR) with XML_ATTRIBUTES or XML_ELEMENTS.

Lastly, we have the WITH clause, which defines the schema for the rowset output. We can either specify the details of the columns to be used, for we can provide a TableName, if the table with the desired schema already exists.

In SQL 2005 and SQL 2008, the OPENXML and sp_xml_preparedocument have been enhanced.

OPENXML in the WITH clause support has been added to specify the new data types introduced in SQL.

sp_xml_prepare_document can now accept input in the form off an XML data type variable or an XML column.

Please Note: In SQL 2005 and 2008, support was added for usage of XQuery to play with XML. I would recommended using XQquery nodes() mehod when trying to parse an XML document. Nodes() method typically performs better than OPENXML, in terms of CPU and memory usage.

I will be discussing usage of XQuery support for XML in subsequent posts.

Converting Relational Tables to XML documents using FOR XML EXPLICIT

Continuing with the series on XML, today we will talk about the FOR XML EXPLICIT clause which can be used with relation database queries to produce XML output. For a complete list of related topics, please refer here.

As mentioned above, we can use the FOR XML EXPICIT clause in the T-SQL queries to produce XML output. FOR XML has different flavors which can be used to produced desired results. Examples for SQL 2000 use the pubs/Northwind database, while SQL 2005 and the SQL 2008 examples are created over the AdventureWorks database which can be downloaded online.

Using the RAW and AUTO mode, we did not have any control over the Nesting of Elements in the XML result. The EXPLICIT mode, provides us more control over the nesting of the elements, but doing so requires a careful understanding of what is the desired nesting level and how to implement the same in your Query.

To use the EXPLICIT mode, we need to write the query in a way to produce the universal table. This table is nothing but a relational structure defining the Nesting of the Elements. Proper order by clauses have to be used to make sure that a Parent and all its children and their children are all together in the universal table.

To accomplish this, we have to always define the 1st and 2nd column in our select query as the TAG and PARENT column. The TAG column will indicate the Nest Level for each element in the XML, while the Parent column would be used to represent the Parent/Child relationship.

An example of a universal table is presented below.

Tag Parent Employee!1!EmpID EmpName!2!FName EmpName!2!LName
1 NULL 1 NULL NULL
2 1 1 Guy Gilbert
1 NULL 2 NULL NULL
2 1 2 Kevin Brown
1 NULL 3 NULL NULL
2 1 3 Roberto Tamburello
1 NULL 4 NULL NULL
2 1 4 Rob Walters
1 NULL 5 NULL NULL
2 1 5 Thierry D’Hers

When SQL Server tries to convert this output into XML, it will pick all ELEMENTS with TAG value of 1 and Parent value of 0, and make them Root Elements. Similarly, elements with TAG value 2 and Parent Value 1, and occurring immediately after the Root Element will be made a child of the Root Element, and so on.

Once again the order of data in the universal table will define the nesting. For Example consider the query below, where I have ordered the result using TAG column.

SELECT 1                    as Tag,
         NULL                 as Parent,
         Customers.CustomerID as [Customer!1!CustomerID],
         NULL                 as [Order!2!OrderID]
FROM Customers
UNION ALL
SELECT 2,
         1,
         Customers.CustomerID,
         Orders.OrderID
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID
ORDER BY TAG

<<<<<<<<<<<<<<<<<<<<<< Universal Table Snippet >>>>>>>>>>>>>>>

TAG Parent

Customer!1!CustomerID

Order!2!OrderID

1 NULL WANDK NULL
1 NULL WARTH NULL
1 NULL WELLI NULL
1 NULL WHITC NULL
1 NULL WILMK NULL
1 NULL WOLZA NULL
2 1 ALFKI 10643
2 1 ALFKI 10692
2 1 ALFKI 10702
2 1 ALFKI 10835
2 1 ALFKI 10952

In this case, SQL will create an XML with multiple root nodes, with the Last root Node having multiple child!

Also note the naming mechanism of the columns used with EXPLICIT mode. The Columns have to named as

ElementName!TagNumber!AttributeName!directive

As with RAW and AUTO mode, EXPLICIT mode can also be used in conjunction some directives. SQL 2000 directives are listed below.

ID : An element attribute can be specified to be an ID type attribute. IDREF and IDREFS attributes can then be used to refer to them, enabling intra-document links. If XMLDATA is not requested, this keyword has no effect.

IDREF : Attributes specified as IDREF can be used to refer to ID type attributes, enabling intra-document links. If XMLDATA is not requested, this keyword has no effect.

IDREFS: Attributes specified as IDREFS can be used to refer to ID type attributes, enabling intra-document links. If XMLDATA is not requested, this keyword has no effect.

HIDE: The attribute is not displayed. This may be useful for ordering the result by an attribute that will not appear in the result.

ELEMENT: This does not generate an attribute. Instead it generates a contained element with the specified name (or generate contained element directly if no attribute name is specified). The contained data is encoded as an entity (for example, the < character becomes &lt;). This keyword can be combined with ID, IDREF, or IDREFS.

XML: This is the same as an element directive except that no entity encoding takes place (for example, the < character remains <). This directive is not allowed with any other directive except hide.

XMLTEXT: The column content should be wrapped in a single tag that will be integrated with the rest of the document. This directive is useful in fetching overflow (unconsumed) XML data stored in a column by OPENXML. If AttributeName is specified, the tag name is replaced by the specified name; otherwise, the attribute is appended to the current list of attributes of the enclosing elements and by putting the content at the beginning of the containment without entity encoding. The column with this directive must be a text type (varchar, nvarchar, char, nchar, text, ntext). This directive can be used only with hide. This directive is useful in fetching overflow data stored in a column. If the content is not a well-formed XML, the behavior is undefined.

CDATA: Contains the data by wrapping it with a CDATA section. The content is not entity encoded. The original data type must be a text type (varchar, nvarchar, text, ntext). This directive can be used only with hide. When this directive is used, AttributeName must not be specified.

I will be using some of these directives in a single query to display the results.

SELECT 1                      as Tag,
         NULL                 as Parent,
         Customers.CustomerID as [Customer!1!CustomerID!ID],
     NULL               as [CustomerName!2!Name!IDREF],
     NULL                  as [CustomerName!2!Phone!Hide],
         NULL                 as [Order!3!OrderID],
     NULL              as [Order!3!OrderDate!ELEMENT]
FROM Customers
UNION ALL
SELECT 2,
         1,
         Customers.CustomerID,
     Customers.ContactName,
     Customers.Phone,   
         NULL,
     NULL
FROM Customers
UNION ALL
SELECT 3,
    2,
    Customers.CustomerID,
    Customers.ContactName,
    NULL,
    Orders.OrderID,
    Orders.OrderDate
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID
ORDER BY [Customer!1!CustomerID!ID],[CustomerName!2!Name!IDREF], [Order!3!OrderID]
FOR XML EXPLICIT, XMLDATA

<<<<<<<<<<<<<<<<<< Snippet Output >>>>>>>>>>>>>>>>>

<Schema name="Schema5" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes">
<ElementType name="Customer" content="mixed" model="open">
        <AttributeType name="CustomerID" dt:type="id"/>
        <attribute type="CustomerID"/>
</ElementType>
<ElementType name="CustomerName" content="mixed" model="open">
    <AttributeType name="Name" dt:type="idref"/><attribute type="Name"/>
</ElementType>
<ElementType name="Order" content="mixed" model="open">
    <AttributeType name="OrderID" dt:type="i4"/><attribute type="OrderID"/>
<Element type="OrderDate"/></ElementType>
    <ElementType name="OrderDate" content="textOnly" model="closed" dt:type="dateTime"/>
    </ElementType>
</ElementType>
</Schema>
<Customer xmlns="x-schema:#Schema5" CustomerID="ALFKI">
    <CustomerName Name="Maria Anders">
        <Order OrderID="10643">
            <OrderDate>1997-08-25T00:00:00</OrderDate>
        </Order>
        <Order OrderID="10692">
            <OrderDate>1997-10-03T00:00:00</OrderDate>
        </Order>
        <Order OrderID="10702">
            <OrderDate>1997-10-13T00:00:00</OrderDate>
        </Order>
        <Order OrderID="10835">
            <OrderDate>1998-01-15T00:00:00</OrderDate>
        </Order>

<<<<<<<<<<<<<<<<<< Snippet Output>>>>>>>>>>>>>>>>>>>>

The same directives and constructs can be used with SQL 2005 and SQL 2008.

Converting Relational Tables to XML documents using FOR XML AUTO

Continuing with the series on XML, today we will talk about the FOR XML AUTO clause which can be used with relation database queries to produce XML output. For a complete list of related topics, please refer here.

As mentioned above, we can use the FOR XML AUTO clause in the T-SQL queries to produce XML output. FOR XML has different flavors which can be used to produced desired results. Examples for SQL 2000 use the pubs database, while those for SQL 2005 and the SQL 2008 examples are created over the AdventureWorks database which can be downloaded online.

FOR XML AUTO Mode

When we use the AUTO mode, we get an XML representation of each row in the T-SQL result set.  When using AUTO mode, SQL server tries to create a nested XML output, in case of multi-table joins. For every table, which has at lease one column from the table in the select list an element is added, correspondingly, the column for that table become the element attributes.

The hierarchy (nesting of the elements) in the result set is based on the order of tables identified by the columns specified in the SELECT clause; therefore, the order in which column names are specified in the SELECT clause is significant. The tables are identified and nested in the order in which the column names are listed in the SELECT clause. The first, leftmost table identified forms the top element in the resulting XML document. The second leftmost table (identified by columns in the SELECT statement) forms a sub-element within the top element, and so on.

The basic syntax of the FOR XML AUTO mode, is similar to the RAW mode.

SELECT <col1, col2,….>
From <table>
FOR XML AUTO, <directives,..>

SELECT Customers.CustomerID, Orders.OrderID, Customers.ContactName
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID
FOR XML AUTO

<<<<<<<< Snippet output >>>>>>>>

<Customers CustomerID="ALFKI" ContactName="Maria Anders">
  <Orders OrderID="10643"/>
  <Orders OrderID="10692"/>
  <Orders OrderID="10702"/>
  <Orders OrderID="10835"/>
  <Orders OrderID="10952"/>
  <Orders OrderID="11011"/>
</Customers>

<<<<<<<< Snippet output >>>>>>>>

When the Element, directive is supplied, the column values are converted to Child Elements for the Table Element as mentioned.

SELECT Customers.CustomerID, Orders.OrderID, Customers.ContactName
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID
FOR XML AUTO, ELEMENTS

<Customers>
    <CustomerID>ALFKI</CustomerID>
    <ContactName>Maria Anders</ContactName>
    <Orders>
        <OrderID>10643</OrderID>
    </Orders>
    <Orders>
        <OrderID>10692</OrderID>
    </Orders>
    …
</Customer>

With SQL 2005 similar enhancements were made to the XML AUTO mode. We can use the other directives with AUTO mode, like we do for RAW mode as described in my post.