SQLUNINTERRUPTED

I am just a medium, SQL Server the Goal

Monthly Archives: March 2012

SQL Server 2012 — ColumnStore Index

In the past, if you have been plagued with Slow performance of your data warehouse queries, there is some good news for you. With SQL Server 2012, we can now create ColumnStore Indexes, which would improve your query performance significantly. ColumnStore indexes are targeted for queries which use a Aggregations over columns or a set of columns.

Traditional SQL Server Approach

SQL Server has traditionally been using a row store based storage. Under this approach the complete data row is stored on a Data Page, and is read or written to as a complete unit. As shown below, the Row is Stored on the page

image

In the traditional approach, lets say if we had a table, with the given definition

create table t1
(
    a int,
    b int,
    c datetime,
    d varchar(300)
)

Assume we are running the following query against the table and that the table has about 1 billion records on approximately 40 million pages.

select SUM(a),AVG(b), MAX(c) from t1

In this case, we just need to do a sum of the values of column a. Given the row store approach, we would need to read the entire row (all the columns) to execute this query, which means we would need to read all the 40 million pages for this query and then perform the aggregation operation.

How ColumnStore Index Helps

When we create a column store index on the table, it changes the way how data is organized. Data for each column which is included in the ColumnStore index is arranged separately. For example if we created a ColumnStore index on the above table on the columns (a, b and c). The columns would be organized independent of each other on different set of pages. as shown below.

image

Additionally, ColumnStore uses page compression by default. Most of the column values are similar if not same, and hence can greatly benefit from compression.

Now, with this arrangement, if we run the above mentioned query, we would only need to read pages for Column a, which in the above case would be only about ~300K(assuming 40% compression) of pages to read and sum the values of a, then another ~300K(assuming 40% compression) pages for the average of b and finally ~500(assuming 50% compression) pages to calculate the Max of c.

As mentioned, this type of organization helps queries where we are returning only some of the columns from a table. For queries where we need to return all the columns in a row, Row-Store approach works better. Typically, OLTP workloads benefit from row storage organization of data, while warehouse queries would benefit from ColumnStore index.

Restrictions on ColumnStore Indexes

The MSDN Article describes the restrictions or limitations to creating ColumnStore indexes. One thing to keep in mind, that when u create a ColumnStore index on a Table, you cannot Update the table. In Order to update the table, the quickest way would be to drop the column store index and then perform your insert, update, deletes. Once you are done with the DML’s, the ColumnStore Index can be recreated.

Example

I create two tables with the same schema but one with only a ColumnStore Index and another with a Clustered Index and a non clustered index. Both the table have about 7.7 million records.

This tables have been created on the AdventureWorks2012 sample database, which can be downloaded from here.

TableWithoutColumnStoreIndex

   1: use AdventureWorks2012

   2: go

   3:  

   4:  

   5: select * into dbo.TableWithoutColumnStoreIndex from Sales.SalesOrderDetail

   6: go

   7:  

   8: declare @count int =0

   9: while @count < 6

  10: begin

  11:     insert into dbo.TableWithoutColumnStoreIndex 

  12:     ([SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  13:     [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate])

  14:         select [SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  15:     [UnitPriceDiscount],[LineTotal],NEWID(), [ModifiedDate] from dbo.TableWithoutColumnStoreIndex

  16:  

  17:     set @count=@count+1

  18: end

  19:  

  20:  

  21: /****** Object:  Index [IDX_Clustered]    Script Date: 2/1/2012 3:01:42 AM ******/

  22: CREATE CLUSTERED INDEX [IDX_Clustered] ON [dbo].[TableWithoutColumnStoreIndex]

  23: (

  24:     [SalesOrderDetailID] ASC

  25: )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

  26: GO

  27:  

  28: /****** Object:  Index [IDX_NonClustered]    Script Date: 2/1/2012 3:01:58 AM ******/

  29: CREATE NONCLUSTERED INDEX [IDX_NonClustered] ON [dbo].[TableWithoutColumnStoreIndex]

  30: (

  31:     [SalesOrderID] ASC,

  32:     [ProductID] ASC

  33: )

  34: INCLUDE (     [ModifiedDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

  35: GO

  36:  

  37:  

TableWithColumnStoreIndex

   1: use AdventureWorks2012

   2: go

   3:  

   4: select * into dbo.TableWithColumnStoreIndex from Sales.SalesOrderDetail where 1=2

   5: go

   6:  

   7: Alter Table dbo.TableWithColumnStoreIndex Alter Column LineTotal numeric(18,6)

   8: go

   9:  

  10: insert into dbo.TableWithColumnStoreIndex 

  11: ([SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  12: [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate])

  13:     select [SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  14: [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate] from Sales.SalesOrderDetail

  15: go

  16:  

  17:  

  18: declare @count int =0

  19: while @count < 6

  20: begin

  21:     insert into dbo.TableWithColumnStoreIndex 

  22:     ([SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  23:     [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate])

  24:         select [SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  25:     [UnitPriceDiscount],[LineTotal],NEWID(), [ModifiedDate] from dbo.TableWithColumnStoreIndex

  26:     set @count=@count+1

  27: end

  28:  

  29:  

  30: -- Create a Column Store Index on the Table

  31: Create NonClustered ColumnStore Index IDX_ColumnStore on dbo.TableWithColumnStoreIndex

  32: (    [SalesOrderID],

  33:     [SalesOrderDetailID],

  34:     [CarrierTrackingNumber],

  35:     [OrderQty],

  36:     [ProductID],

  37:     [SpecialOfferID],

  38:     [UnitPrice],

  39:     [UnitPriceDiscount],

  40:     [LineTotal],

  41:     [ModifiedDate]

  42: )

I executed the below mentioned query on the two table to compare the performance. In the Query below, I am just substituting the Name of the tables([TableWithColumnStoreIndex], or  [TableWithoutColumnStoreIndex]) While executing.

   1: select

   2: ( ISNull(P.FirstName,'') + ' ' + ISNull(P.MiddleName,'') + ' ' + IsNull(P.LastName,'')) As CustomerName,

   3: SOH.OrderDate, SOH.DueDate, SOH.ShipDate,SOH.TotalDue,

   4: sum(TWC.OrderQty) As TotalOrderQuantity ,avg(TWC.UnitPrice) As AvgUnitPrice,

   5: Avg(TWC.UnitPriceDiscount) as AvgDiscountOnUnitPrice

   6: from 

   7: [dbo].[TableWithoutColumnStoreIndex] TWC 

   8: join Sales.SalesOrderHeader SOH on TWC.SalesOrderID = SOH.SalesOrderID

   9: join Sales.Customer C on SOH.CustomerID = C.CustomerID

  10: join Person.Person P on P.BusinessEntityID = C.PersonID

  11: where TWC.UnitPriceDiscount <> 0 and TWC.OrderQty > 500

  12: group by 

  13: ( ISNull(P.FirstName,'') + ' ' + ISNull(P.MiddleName,'') + ' ' + IsNull(P.LastName,'')),

  14: SOH.OrderDate, SOH.DueDate, SOH.ShipDate,SOH.TotalDue

Query Performance Against the Table without ColumnStore Index

When executed against the Table without ColumnStore index, the query table takes about 1248 millisecond.

image

Query Performance Against the Table with ColumnStore Index

When executed against the Table with ColumnStore Index, the query executes in about 218 milliseconds. A 5x performance improvement.

image

Finding disk free space and other file level information for SQL Files

Recently a friend of mine, Rahul Soni in his blog wrote a .Net code to retrieve the file information and other drive information for all SQL Server files.

While discussing this, he wanted to know if this can be done using T-SQL or not, and so the blog article follows.

Most of the file level information can be retrieved from the catalog View Sys.Master_files. To run queries against this view and see the records, the user would need the following permissions

  • CREATE DATABASE
  • ALTER ANY DATABASE
  • VIEW ANY DEFINITION

The catalog view returns information about all the files for all the databases present in SQL Server. Details about the view can be found in SQL Server Books Online.

Additionally, to find the free space information on the Drives, I used to xp_fixeddrives extended stored procedure.

Included below is the code for the stored procedure which I wrote to get this information

   1: USE [master]

   2: GO

   3: /****** Object:  Stored Procedure [dbo].[usp_getFileinformation]    Script Date: 03/13/2012 20:13:15 ******/

   4: SET ANSI_NULLS ON

   5: GO

   6: SET QUOTED_IDENTIFIER ON

   7: GO

   8: ALTER procedure [dbo].[usp_getFileinformation]

   9: As

  10: create table #fixeddrives

  11: (

  12: Drive varchar(2), 

  13: FreeSpaceMB int

  14: )

  15:  

  16: insert into #fixeddrives(Drive,FreeSpaceMB) exec xp_fixeddrives

  17: select  

  18: DB_NAME(database_id) As DatabaseName, type_desc as FileType, 

  19: name as [FileName], physical_name as FileLocation, 

  20: (size*8)  as CurrentSizeInKB, 

  21: case    max_size

  22:      when -1 then null

  23:      else max_size

  24: end As MaxFileSize,

  25: Case  

  26:     when is_percent_growth = 1 then CAST(growth AS varchar(5)) + '%'

  27:     else CAST(growth AS varchar(5)) + ' MB'

  28: End FileGrowth,

  29: SUBSTRING(physical_name, 1,1) As Drive

  30: into #temp from master.sys.master_files

  31:  

  32: select t.DatabaseName, t.FileType, t.FileName, t.FileLocation, t.CurrentSizeInKB, t.MaxFileSize,t.FileGrowth, 

  33: fd.FreeSpaceMB As FreeSpaceOnDrive_MB

  34: from  #temp t

  35: inner join #fixeddrives fd on fd.Drive = t.Drive

This stored procedure returns the following information.

image

Uploading multiple Reports to the Report Server using .Net

Continuing the series on Programming for Reporting Services, we will see how to upload multiple reports (*.rdl) files present in a folder, to the Report Server. While this can be done using Report Manager (uploading the reports one-by-one), this obviously is not an efficient way of doing it. In this post, we will see how we can automate the task using .Net.

The code, here assumes that all the *.rdl present in a file system folder would be uploaded to the same Report Server folder. if you require to add reports to multiple folder, then I suggest you create a folder hierarchy, to replicate your Report Server folder hierarchy.

For the purpose of this post, I have created the below mentioned folder hierarchy. Each folder contains some report, which needs to be uploaded to the report server. The reports would be uploaded to a similar folder hierarchy on the Report Server. If the folders do not exists on the Report Server, it would be first created and then the report uploaded.

image

Each of this folders contain some RDL file, which needs to be uploaded to the Report Server. I have created a Console Application for this purpose.

The First step would be create a method to iterate through the folder structure and read the RDL files.  Then we would need a method read the RDL files and get the report definition. This can be done using one of the IO stream classes in .Net. In my console Application, I have added a C# class named ReadReportDefinition.cs. Code included below.

   1: public static class ReadReportDefinition

   2:     {

   3:         public static void IterateFolderHierarchy(string folderpath)

   4:         {

   5:             ConfigureReports cr = new ConfigureReports();

   6:             string connectionString = "data source=." + ";Initial Catalog=PowerSaverLogs";

   7:             cr.UploadDataSource("PowerManagementDataSource", "SQL", connectionString);

   8:             try

   9:             {

  10:                 DirectoryInfo dir = new DirectoryInfo(folderpath);

  11:                 try

  12:                     { 

  13:                         DirectoryInfo[] subdirectories = 

  14:                                     dir.GetDirectories("*", SearchOption.TopDirectoryOnly);

  15:                         byte[] ReportDefinition = null;

  16:                         foreach (DirectoryInfo Dir in subdirectories)

  17:                         {

  18:                             cr.CheckReportServerFolder(Dir.Name);

  19:                             FileInfo[] ReportFiles = Dir.GetFiles("*.rdl");

  20:                             foreach (FileInfo file in ReportFiles)

  21:                             {

  22:                                 FileStream stream = File.OpenRead(file.FullName);

  23:                                 ReportDefinition = new Byte[stream.Length];

  24:                                 stream.Read(ReportDefinition, 0, (int)stream.Length);

  25:                                 stream.Close();

  26:                                 cr.UploadReports(Dir.Name,ReportDefinition, file.Name.Substring(0, (file.Name.Length - 4)));

  27:                                 ReportDefinition = null;

  28:                             }

  29:  

  30:                         }

  31:                 }

  32:                 catch(System.IO.IOException E)

  33:                     {

  34:                         Console.WriteLine("System Exception Message" + E.Message);

  35:                     }

  36:                 

  37:             }

  38:             catch(System.IO.DirectoryNotFoundException e)

  39:             {

  40:                 Console.WriteLine("System Exception Message" + e.Message);

  41:             }

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

The next step, would be to add a reference to one of the Reporting Services endpoints, and then dynamically binding it to the Report Server, where the reports needs to be uploaded. For information about, how to add a reporting service endpoint and dynamically bind it to a Report Server, click here.

The next step, is to create a class or methods to call the ReportingService2010 endpoint (this is what I am using in my code) and perform the required actions. In this case, I created a new Class “Configure Reports” to perform the desired operations.

This class has 3 methods, and the constructor. In the constructor, I dynamically bind the endpoint to my Report Server instance.

  • CheckReportServerFolder: Checks if a particular folder exists on the report server on not. This uses the ReportingService2010 Listchildren method.
  • UploadReports : Uploads reports to the Report Server. This function calls the ReportingService2010 CreateReport method.
  • UploadDataSource: To upload the datasource to the report server. Uses the ReportingService2010 CreateDataSource function.
   1: using System;
   3: using System.Linq;

   2: using System.Collections.Generic;

   4: using System.Text;

   5: using System.Web.Services;

   6:  

   7: namespace UploadMultipleReports

   8: {

   9:     public class ConfigureReports

  10:     {

  11:         private const string Root = "/";

  12:         private ReportingService2010 rs;

  13:  

  14:         public ConfigureReports()

  15:         {

  16:             try

  17:             {

  18:                 // Connect to Reporting Services

  19:                 rs = new ReportingService2010();

  20:                 rs.Credentials = System.Net.CredentialCache.DefaultCredentials;

  21:                 rs.Url = "http://localhost/reportserver/ReportService2010.asmx";

  22:             }

  23:             catch (System.Exception ex)

  24:             {

  25:                 Console.WriteLine(ex.Message);

  26:             }

  27:  

  28:         }

  29:         public void CheckReportServerFolder(string foldername)

  30:         {

  31:             int count = 0;

  32:             int datasourcecount =0;

  33:             try

  34:             {

  35:                 CatalogItem[] list = rs.ListChildren("/", false);

  36:                 foreach (CatalogItem c in list)

  37:                 {

  38:                     if ((c.TypeName == "Folder") & (c.Name == foldername))

  39:                         count += 1;

  40:                     if((c.TypeName=="Data Source")& (c.Name == "PowerManagementDataSource"))

  41:                         datasourcecount+=1;

  42:                 }

  43:                 if (count == 0)

  44:                     rs.CreateFolder(foldername, "/", null);

  45:             }

  46:             catch (Exception e)

  47:             {

  48:                 Console.WriteLine(e.Message);

  49:             }

  50:         }

  51:         public void UploadReports(string ParentFolder,byte[] definition,string ReportName)

  52:         {

  53:             try

  54:             {

  55:                 string parent = "/" + ParentFolder;

  56:                 /// Create the Reports

  57:                 ///

  58:                 Warning[] warnings = null;

  59:                 rs.CreateCatalogItem("Report", ReportName, parent, true, definition, null, out warnings);

  60:                 if (warnings != null)

  61:                 {

  62:                     Console.WriteLine("Report {0} could not be uploaded to folder {1}", ReportName, ParentFolder);

  63:                     foreach(Warning war in warnings)                    

  64:                         Console.WriteLine("Warning Generated {0}", war.Message);

  65:                 }

  66:             }

  67:             catch (Exception e)

  68:             {

  69:                 Console.WriteLine(e.Message);

  70:             }

  71:  

  72:         }

  73:         public void UploadDataSource(string name, string extension, string connstring)

  74:         {

  75:             const string Parentpath = "/";

  76:             DataSourceDefinition definition = new DataSourceDefinition();

  77:             definition.CredentialRetrieval = CredentialRetrievalEnum.Integrated;

  78:             definition.ConnectString = connstring;

  79:             definition.Extension = extension;

  80:             definition.Enabled = true;

  81:             definition.EnabledSpecified = true;

  82:             definition.ImpersonateUserSpecified = true;

  83:             // Use the default prompt string.

  84:             definition.Prompt = null;

  85:             definition.WindowsCredentials = false;

  86:             try

  87:             {

  88:                 rs.CreateDataSource(name, Parentpath, false, definition, null);

  89:             }

  90:             catch (Exception e)

  91:             {

  92:                 Console.WriteLine(e.Message);

  93:             }

  94:         }

  95:     }

  96: }

Once the classes are in place, all we need to do is call them in the application main function.

   1: class Program

   2: {

   3:     static void Main(string[] args)

   4:     {

   5:         Console.WriteLine("Enter Report File Root Directory");

   6:         string folderpath = Console.ReadLine();

   7:         ReadReportDefinition.IterateFolderHierarchy(folderpath);

   8:         Console.ReadLine();

   9:     }

  10: }

SQL Server 2012 RTM’s Today

Today, SQL Server 2012 has been released to Manufacturing. Get ready for some really cool features with SQL 2012, like File Table, ColumnStore Index and a very cool reporting environment PowerView. More information available here.

Dynamically binding a Web Service reference to a Report Server Instance

One of the most important aspect of SQL Server Reporting Services, which makes it stand out amongst competition is its extensibility. If you know .Net programming, then you can almost anything with Reporting services. In this series of posts I would be talking about some things which can be done with .Net and Reporting Services.

Typically speaking, .Net with Reporting Services can be used to create any of the following.

  • .Net Applications (console, UI bases) to automate SSRS tasks.
  • Reporting Services extensions (data, security, delivery etc.) to be used with Reporting Services.
  • Reference libraries to be used directly with the Reports in Reporting Services.
  • .Net code to (web service, Forms application) to consume Reporting Services Reports.

In this post, I would be talking about how to dynamically bind a report server instance using a web service reference.

Reporting Services Web service provides a SOAP over HTTP interface for communication with the Report Server. It provides two endpoints, one for execution and another for management. The most important thing, when creating an application for SSRS is to the add the Reporting services proxy class to the project to the class. This can be done using one of the two ways.

Once you have added the web service reference, using any of the two methods described above, the most important thing is to dynamically bind the web service to a Report Server instance.

To dynamically bind the web service to any Report Server of your choice.

  • First create an instance of the SSRS web service, which was added to the project. For example, if you added the ReportService2010 endpoint, create an object for the same.
  • Set the URL property of this object to the desired Report Server web service URL.
  • Define appropriate security credentials.
ReportingService2010 rs = new ReportingService2010();

rs.Url = "http://localhost/reportserver/ReportService2010.asmx";

rs.Credentials = System.Net.CredentialCache.DefaultCredentials;

To make sure that the URL property is recognized by the compiler, you would need to add the following reference to your code.

using System.Web.Services;

At this point, we can call any method defined in the Web Service and perform the desired operation.

In the next post, I will talk about, creating an Application, which can upload, all the .RDL files present in a folder to a report server.