SQLUNINTERRUPTED

I am just a medium, SQL Server the Goal

Tag Archives: XML Indexes

Working with XML Indexes in SQL Server

1 Comment Posted by Sourabh Agarwal on February 22, 2012

Earlier we have seen how SQL Server 2005 allows us to create XML instances in the database, using the XML Data type, introduced in SQL Server. With the introduction of SQL Server XML data types, we also have the ability to create XML indexes. For a complete details of posts related to XML, please refer here.

SQL Server allows us to create XML indexes on XML data type columns, just like it allows us to create indexes on other Data type columns. There are 2 kinds of indexes, which are allowed by SQL Server.

Primary XML indexes
Secondary XML indexes

Why do we need XML Indexes?

Columns with XML data types contain XML instances which can be fairly large (almost 2 GB), which means trying to work on XML documents (loading then in memory and then shredding them to read values) can be an expensive operation. For example, consider a table with the below mentioned columns

   1:  Create Table XMLIndex

   2:  (

   3:  Col1 int,

   4:  Col2 int,

   5:  Col3 XML (SomeXMLSchemaCollection)

   6:  )

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

Assuming the above table contains about 60K records, where each record contains an XML document in the XML column. Running a query like the one below,

   1:  Select Col1, Col2, Col3.query('.')

   2:  from XMLIndex

   3:  where Col3.exist('//SomeElement[.=”SomeValue”]')=1

will need to load all XML instances in Memory and then shred them to figure out the XML Instances where the Exist Condition hold true. This would be a very costly operation, leading to a lot of CPU and memory usage on SQL Server.

So if we have an application or a Database environment, where there are lot of queries running against the XML instances and it requires shredding of XML in SQL Memory, XML indexes will help. But please note that there is a cost associated with XML indexes when modifications happen on the XML instances.

Primary XML Indexes

Creating a Primary XML index, created a persistent representation of the XML instance in the XML columns. When we create a Primary index on the XML column, it will create a row-set representation of the XML. One row is created for every node in the XML.

Each row stores the following node information:

Tag name such as an element or attribute name.
Node value.
Node type such as an element node, attribute node, or text node.
Document order information, represented by an internal node identifier.
Path from each node to the root of the XML tree. This column is searched for path expressions in the query.
Primary key of the base table. The primary key of the base table is duplicated in the primary XML index for a back join with the base table.

Secondary XML Indexes

We can also create secondary XML indexes on XML Columns where a Primary XML index has already been created. There are 3 types of secondary XML Indexes which can be created in XML columns.

PATH XML Index

The PATH index is created on top of the Primary XML index and has the PATH and VALUE columns in the Primary index row-set table. So for every node, the PATH index will store the Path of that node from the root and the Value of that Node. This will be useful for queries which use PATH expressions for Iterations. The Index keys are in the following format {PATH, VALUES}.

VALUE XML Index

The VALUE index is created on top of the Primary XML index and has the VALUE and PATH columns in the Primary index row-set table. So for every node, the VALUE index will store the value of that node and the Path of the node from the Root Column. This will be useful for queries which use node values. For example columns which use the VALUE XML data type method. The Index keys are in the following order {VALUES, PATH}.

Property XML Index:

The Property Index is created on “Primary Key of base Table” along with the Path and Value columns on the Primary XML Index row-set. This can be useful in queries where the Value() method are being used, along with the Primary Key for the base table being specified in the query. For example, a query of the following kind will probably benefit from the Property index.

   1:  Select

   2:  XMLColum.query('.'), Col1, Col2

   3:  from XMLIndex

   4:  where Col1 = SomeValue and

   5:  XMLColumn.value('(/root/element1/@elementID)[1]')=10

-- Assuming that Col1 is the Primary Key for the base Table.

The Property Index keys are in the order {PKey, PATH, VALUE}.

Considerations for XML Indexes

Creating a Primary/Secondary XML Index requires certain steps to be performed on the base tables, before the XML index can be created. Some of the important considerations are

Before creating an XML Primary Index, the base table must contain a Clustered Index on the Primary Key of the table.
Primary Clustered primary key on the base table cannot be altered if we have an XML index on the base table.
Primary XML index can be created on only one XML column. Non-XML indexes cannot be created with the XML columns as on of the key column. However XML columns can be added as INCLUDE columns in non-XML indexes.
ONLINE option for XML indexes is always OFF.
XML Indexes cannot be created on XML columns in views, Table values variable or on XML variables.
When creating XML indexes SET ARITHMATIC ABORT option should be ON, and any connection trying to modify XML data must have the ARTHIMATIC ABORT ON, or else the XML methods will fail.

When altering indexes on a Table, care must be taken to perform maintenance of XML indexes separately. For example, ALTER INDEX ALL with an ONLINE=ON option will fail for tables on which XML indexes have been created, since XML indexes do not support ONLINE operations.

For dropping XML indexes, a different syntax must be used. The earlier TableName.Indexname does not work with XML indexes. We need to use the command

DROP INDEX <INDEX_NAME> ON <TABLE_NAME>

Dropping a Primary Index, will drop all secondary indexes associated with the table.

In the next posts, I will be talking about whether we should use XML indexes or not.

SQL Engine, XML SQL Server, XML, XML Indexes

SQLUNINTERRUPTED

Tag Archives: XML Indexes

Working with XML Indexes in SQL Server

Statistics

Archives

Categories

Meta

Author

Follow Blog via Email

Blogroll

Recent Posts

Feeds