I am just a medium, SQL Server the Goal

Archive for the ‘Performance’ Category

Optimizing your backup – Tips and Tricks

Posted by Sourabh Agarwal on September 25, 2013

Last week at the TechCon2013, sponsored by the SQLBangalore User Group and other User Groups in Bangalore, I did a talk on how to optimize your backup/restore. I focused on the three main aspects of Backup/Restore and talked about some of the things which can be done to optimize them

  1. Read from Disk
  2. Store in Memory
  3. Write to Disk/Tape

For the Demos, I used 3 databases as described.

Database 1

  • Size – 12 GB
  • DB Name – AdventureWorks2012SingleFile
  • Number of Data Files: 1
  • Disk Allocation Size: 4KB

Database 2

  • Size – 12 GB
  • DB Name – AdventureWorks2012MultiFile
  • Number of Data Files: 4 (spread on 4 different physical disk)
  • Disk Allocation Size: 64KB

Database 3

  • Size – 4 GB
  • DB Name – AnotherSingleFileDatabase
  • Number of Data Files: 1
  • Disk Allocation Size: 64KB

The slide deck for the presentation is included below.

Optimizing Reads

To optimize reads, I focused mainly on the following

  1. Reading from a single MDF file vs. reading from multiple Data files
  2. Using higher disk allocation unit, I would be writing another post on effect of disk allocation unit size to SQL backup.

All backups were performed to a Null device. The idea was to test the read performance. I saw the following backup performance for the databases mentioned above.


   2: backup database [AdventureWorks2012SingleFile]

   3: to disk = 'NUL' WITH COPY_ONLY

   4: --BACKUP DATABASE successfully processed 1521753 pages in 26.510 seconds (448.460 MB/sec).


   6: backup database [AnotherSingleFileDatabase]

   7: to disk = 'NUL' WITH COPY_ONLY

   8: --BACKUP DATABASE successfully processed 569482 pages in 9.202 seconds (483.489 MB/sec).


  10: backup database [AdventureWorks2012MultiFile]

  11: to disk = 'NUL' WITH COPY_ONLY

  12: --BACKUP DATABASE successfully processed 1599457 pages in 25.244 seconds (494.998 MB/sec).

I get better performance with later disk allocation sizes and multiple sizes.

Optimizing Writes

To optimize reads, I focused mainly on the following

  1. Writing to a single backup file vs. writing to multiple backup files
  2. Using Compression
  3. Changing the MaxTransferSize option for the backups.

All backups were performed to a 4K allocation unit drive, except for the multiple file backup, which were performed on a 64K Allocation unit disk. The following backup performance was observed.

   1: backup database [AnotherSingleFileDatabase]

   2: to disk = 'C:\AnotherSingleFileDatabase.bak'

   3: Go

   4: --BACKUP DATABASE successfully processed 569493 pages in 52.510 seconds (84.729 MB/sec).


   6: -- Delete the file to preserve Space

   7: xp_cmdshell 'del C:\AnotherSingleFileDatabase.bak'

   8: -- Backup taken to Multiple Files on Different drives of 64K disk allocation Units

   9: backup database [AnotherSingleFileDatabase]

  10: to 

  11: disk = 'F:\AnotherSingleFileDatabase_BackupFile1.bak',

  12: disk = 'G:\AnotherSingleFileDatabase_BackupFile1.bak',

  13: disk = 'H:\AnotherSingleFileDatabase_BackupFile1.bak',

  14: disk = 'E:\AnotherSingleFileDatabase_BackupFile1.bak'

  15: Go

  16: --BACKUP DATABASE successfully processed 569490 pages in 33.600 seconds (132.414 MB/sec).

  17: --- Delete all the files

  18: xp_cmdshell 'del E:\AnotherSingleFileDatabase_BackupFile1.bak'

  19: go

  20: xp_cmdshell 'del H:\AnotherSingleFileDatabase_BackupFile1.bak'

  21: go

  22: xp_cmdshell 'del G:\AnotherSingleFileDatabase_BackupFile1.bak'

  23: go

  24: xp_cmdshell 'del F:\AnotherSingleFileDatabase_BackupFile1.bak'

  25: go


  27: -- Backup With COMPRESSION

  28: backup database [AnotherSingleFileDatabase]

  29: to disk = 'C:\AnotherSingleFileDatabase.bak'


  31: Go

  32: ---BACKUP DATABASE successfully processed 569491 pages in 25.721 seconds (172.977 MB/sec).

  33: xp_cmdshell 'del C:\AnotherSingleFileDatabase.bak'

  34: Go



  37: DBCC TRACEON(3605,3213,-1)

  38: -- Writes backup Configuration related informtion to the trace file.


  40: backup database [AnotherSingleFileDatabase]

  41: to disk = 'C:\AnotherSingleFileDatabase.bak'


  43: go

  44: xp_cmdshell 'del C:\AnotherSingleFileDatabase.bak'

  45: Go

  46: sp_readerrorlog

  47: go


  49: backup database [AnotherSingleFileDatabase]

  50: to disk = 'C:\AnotherSingleFileDatabase.bak'


  52: Go

  53: -- BACKUP DATABASE successfully processed 569490 pages in 21.920 seconds (202.971 MB/sec).

  54: xp_cmdshell 'del C:\AnotherSingleFileDatabase.bak'

  55: Go

We got better write performance when using a 64K allocation unit disk and when using Multiple Files. Further performance improvement was observed with Compression and by increasing the MaxTransferSize for the backups.

Optimizing Memory

To optimize reads, I focused mainly on changing the BufferCount option for the backups. This would help create more buffers in the memory for the buffer.

Total Memory used = BufferCount * MaxTransferSize

While i did not have a demo for the performance improvement when increasing the number of buffer, I had a demo on the side affect of increasing the BufferCount/MaxTransferSize to be very high.

   1: backup database [AnotherSingleFileDatabase]

   2: to disk = 'C:\AnotherSingleFileDatabase.bak'


   4: Go

   5: -- Error Message

   6: --There is insufficient system memory in resource pool 'default' to run this query.

If we increase the BufferCount and the MaxTransferSize to be very high, we would get into memory issues on the server.

Hopefully this information helps.

Posted in Backup, Performance, SQL Engine, SQL Server, Storage Engine | Leave a Comment »

Target_Recovery_Time – Indirect Checkpoint SQL Server 2012

Posted by Sourabh Agarwal on March 19, 2012

SQL Server 2012, introduces a new Database level option “Target_Recovery_Time” to better control and predict the time it takes during Recovery for the Database. The reason for this option was simple, the “Recovery Interval” option in sp_configure was really not reliable enough.

Before we talk about the new option, lets talk a bit about the older option.

With the Recovery interval option (set in minutes, default is 0), SQL Server would schedule the Database checkpoints in such a way that the time taken to recover the database is similar to what is specified in the Recovery Interval setting. What happens in the background is that SQL Server tries to schedule the checkpoint based on the number of DML operations that are taking place, but fails to take into consideration the number of DB Pages being modified by these operations.

As shown in the Video, SQL Server first calculates the Checkpoint interval for the Database number of DML operations and the value of “Recovery Interval” option.

When the first transaction completes (which modifies about 160 pages) and the subsequent fires, we flush all these 160 pages to the disk an move the Min_Recovery_LSN forward to the checkpoint LSN. This Min Recovery LSN is the what is used during the Database recovery, to Predict what operations are to be repeated in the Redo Phase of Recovery. During the Second Transaction (which is modifying close to 12K pages), if there were no problems, during checkpoint we would have flushed these 12K pages to the disk, which would result in a spike in the Disk Write Activity. Assuming that there was a crash, just before the Checkpoint happened (the transaction did commit), because of the large amount of pages which needs to be loaded and modified, the time to recover the database might vary and would be unpredictable.

The current Checkpoint algorithm induces the following problems, which are indeed the reason why the new Database was introduced.

  • Massive spikes in the Disk Write activity, during checkpoint.
  • Unpredictable recovery times for the Database.

The idea was to smooth out the disk write activity, so that we no longer see the massive spikes and to also have better predictability of the recovery time of a database.

The new database option can be set using the following Command.


   2: GO

or it can be set using the Database Properties window in SSMS.


When the new database option is set, SQL Server performs two things,

  • First it  introduces a background task to continuously flush the Dirty buffers from the disk. What SQL actually does is it calculates a min number of Dirty buffers it would keep. Every time this threshold is exceeded, the background process (called the background Recovery writer) kicks in and flushes the dirty buffers to the disk.
  • Continously move the Min_Recovery_LSN forwards to the LSN, which was most recently flushed. Since the Min_Recovery LSN is continuously moving forward, in case of a crash happens, the amount of work done during the Recovery would be small.

The video, below illustrates how this is happens.

With the new setting, the writes are no longer in spikes (smoothens out because of the background recovery writer) and the amount of work which needs to be done during the Recovery Phase is also reduced, providing more accurate estimations of the Recovery Time for the Database.

Word of Caution

If you set the Target_Recovery_Time to a very small number (like 1 sec) on an OLTP environment, SQL Server might just throttle the disk with the massive amount of Write operations which needs to be done as part of the Background Recovery Writes. Set this option, only after carefully examining the workload pattern and your SLA commitments. Most often the older “Recovery Interval” option from sp_configure works well.

I would like to thank my friend Parikshit Savjani, for his help with the Videos.

Posted in Performance, SQL Engine, SQL Server | Tagged: , | 2 Comments »

SQL Server 2012 — ColumnStore Index

Posted by Sourabh Agarwal on March 14, 2012

In the past, if you have been plagued with Slow performance of your data warehouse queries, there is some good news for you. With SQL Server 2012, we can now create ColumnStore Indexes, which would improve your query performance significantly. ColumnStore indexes are targeted for queries which use a Aggregations over columns or a set of columns.

Traditional SQL Server Approach

SQL Server has traditionally been using a row store based storage. Under this approach the complete data row is stored on a Data Page, and is read or written to as a complete unit. As shown below, the Row is Stored on the page


In the traditional approach, lets say if we had a table, with the given definition

create table t1
    a int,
    b int,
    c datetime,
    d varchar(300)

Assume we are running the following query against the table and that the table has about 1 billion records on approximately 40 million pages.

select SUM(a),AVG(b), MAX(c) from t1

In this case, we just need to do a sum of the values of column a. Given the row store approach, we would need to read the entire row (all the columns) to execute this query, which means we would need to read all the 40 million pages for this query and then perform the aggregation operation.

How ColumnStore Index Helps

When we create a column store index on the table, it changes the way how data is organized. Data for each column which is included in the ColumnStore index is arranged separately. For example if we created a ColumnStore index on the above table on the columns (a, b and c). The columns would be organized independent of each other on different set of pages. as shown below.


Additionally, ColumnStore uses page compression by default. Most of the column values are similar if not same, and hence can greatly benefit from compression.

Now, with this arrangement, if we run the above mentioned query, we would only need to read pages for Column a, which in the above case would be only about ~300K(assuming 40% compression) of pages to read and sum the values of a, then another ~300K(assuming 40% compression) pages for the average of b and finally ~500(assuming 50% compression) pages to calculate the Max of c.

As mentioned, this type of organization helps queries where we are returning only some of the columns from a table. For queries where we need to return all the columns in a row, Row-Store approach works better. Typically, OLTP workloads benefit from row storage organization of data, while warehouse queries would benefit from ColumnStore index.

Restrictions on ColumnStore Indexes

The MSDN Article describes the restrictions or limitations to creating ColumnStore indexes. One thing to keep in mind, that when u create a ColumnStore index on a Table, you cannot Update the table. In Order to update the table, the quickest way would be to drop the column store index and then perform your insert, update, deletes. Once you are done with the DML’s, the ColumnStore Index can be recreated.


I create two tables with the same schema but one with only a ColumnStore Index and another with a Clustered Index and a non clustered index. Both the table have about 7.7 million records.

This tables have been created on the AdventureWorks2012 sample database, which can be downloaded from here.


   1: use AdventureWorks2012

   2: go



   5: select * into dbo.TableWithoutColumnStoreIndex from Sales.SalesOrderDetail

   6: go


   8: declare @count int =0

   9: while @count < 6

  10: begin

  11:     insert into dbo.TableWithoutColumnStoreIndex 

  12:     ([SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  13:     [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate])

  14:         select [SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  15:     [UnitPriceDiscount],[LineTotal],NEWID(), [ModifiedDate] from dbo.TableWithoutColumnStoreIndex


  17:     set @count=@count+1

  18: end



  21: /****** Object:  Index [IDX_Clustered]    Script Date: 2/1/2012 3:01:42 AM ******/

  22: CREATE CLUSTERED INDEX [IDX_Clustered] ON [dbo].[TableWithoutColumnStoreIndex]

  23: (

  24:     [SalesOrderDetailID] ASC


  26: GO


  28: /****** Object:  Index [IDX_NonClustered]    Script Date: 2/1/2012 3:01:58 AM ******/

  29: CREATE NONCLUSTERED INDEX [IDX_NonClustered] ON [dbo].[TableWithoutColumnStoreIndex]

  30: (

  31:     [SalesOrderID] ASC,

  32:     [ProductID] ASC

  33: )


  35: GO




   1: use AdventureWorks2012

   2: go


   4: select * into dbo.TableWithColumnStoreIndex from Sales.SalesOrderDetail where 1=2

   5: go


   7: Alter Table dbo.TableWithColumnStoreIndex Alter Column LineTotal numeric(18,6)

   8: go


  10: insert into dbo.TableWithColumnStoreIndex 

  11: ([SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  12: [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate])

  13:     select [SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  14: [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate] from Sales.SalesOrderDetail

  15: go



  18: declare @count int =0

  19: while @count < 6

  20: begin

  21:     insert into dbo.TableWithColumnStoreIndex 

  22:     ([SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  23:     [UnitPriceDiscount],[LineTotal],rowguid, [ModifiedDate])

  24:         select [SalesOrderID],[CarrierTrackingNumber],[OrderQty],[ProductID],[SpecialOfferID],[UnitPrice],

  25:     [UnitPriceDiscount],[LineTotal],NEWID(), [ModifiedDate] from dbo.TableWithColumnStoreIndex

  26:     set @count=@count+1

  27: end



  30: -- Create a Column Store Index on the Table

  31: Create NonClustered ColumnStore Index IDX_ColumnStore on dbo.TableWithColumnStoreIndex

  32: (    [SalesOrderID],

  33:     [SalesOrderDetailID],

  34:     [CarrierTrackingNumber],

  35:     [OrderQty],

  36:     [ProductID],

  37:     [SpecialOfferID],

  38:     [UnitPrice],

  39:     [UnitPriceDiscount],

  40:     [LineTotal],

  41:     [ModifiedDate]

  42: )

I executed the below mentioned query on the two table to compare the performance. In the Query below, I am just substituting the Name of the tables([TableWithColumnStoreIndex], or  [TableWithoutColumnStoreIndex]) While executing.

   1: select

   2: ( ISNull(P.FirstName,'') + ' ' + ISNull(P.MiddleName,'') + ' ' + IsNull(P.LastName,'')) As CustomerName,

   3: SOH.OrderDate, SOH.DueDate, SOH.ShipDate,SOH.TotalDue,

   4: sum(TWC.OrderQty) As TotalOrderQuantity ,avg(TWC.UnitPrice) As AvgUnitPrice,

   5: Avg(TWC.UnitPriceDiscount) as AvgDiscountOnUnitPrice

   6: from 

   7: [dbo].[TableWithoutColumnStoreIndex] TWC 

   8: join Sales.SalesOrderHeader SOH on TWC.SalesOrderID = SOH.SalesOrderID

   9: join Sales.Customer C on SOH.CustomerID = C.CustomerID

  10: join Person.Person P on P.BusinessEntityID = C.PersonID

  11: where TWC.UnitPriceDiscount <> 0 and TWC.OrderQty > 500

  12: group by 

  13: ( ISNull(P.FirstName,'') + ' ' + ISNull(P.MiddleName,'') + ' ' + IsNull(P.LastName,'')),

  14: SOH.OrderDate, SOH.DueDate, SOH.ShipDate,SOH.TotalDue

Query Performance Against the Table without ColumnStore Index

When executed against the Table without ColumnStore index, the query table takes about 1248 millisecond.


Query Performance Against the Table with ColumnStore Index

When executed against the Table with ColumnStore Index, the query executes in about 218 milliseconds. A 5x performance improvement.


Posted in Performance, SQL Engine, SQL Server | Tagged: , | Leave a Comment »

Finding disk free space and other file level information for SQL Files

Posted by Sourabh Agarwal on March 13, 2012

Recently a friend of mine, Rahul Soni in his blog wrote a .Net code to retrieve the file information and other drive information for all SQL Server files.

While discussing this, he wanted to know if this can be done using T-SQL or not, and so the blog article follows.

Most of the file level information can be retrieved from the catalog View Sys.Master_files. To run queries against this view and see the records, the user would need the following permissions


The catalog view returns information about all the files for all the databases present in SQL Server. Details about the view can be found in SQL Server Books Online.

Additionally, to find the free space information on the Drives, I used to xp_fixeddrives extended stored procedure.

Included below is the code for the stored procedure which I wrote to get this information

   1: USE [master]

   2: GO

   3: /****** Object:  Stored Procedure [dbo].[usp_getFileinformation]    Script Date: 03/13/2012 20:13:15 ******/


   5: GO


   7: GO

   8: ALTER procedure [dbo].[usp_getFileinformation]

   9: As

  10: create table #fixeddrives

  11: (

  12: Drive varchar(2), 

  13: FreeSpaceMB int

  14: )


  16: insert into #fixeddrives(Drive,FreeSpaceMB) exec xp_fixeddrives

  17: select  

  18: DB_NAME(database_id) As DatabaseName, type_desc as FileType, 

  19: name as [FileName], physical_name as FileLocation, 

  20: (size*8)  as CurrentSizeInKB, 

  21: case    max_size

  22:      when -1 then null

  23:      else max_size

  24: end As MaxFileSize,

  25: Case  

  26:     when is_percent_growth = 1 then CAST(growth AS varchar(5)) + '%'

  27:     else CAST(growth AS varchar(5)) + ' MB'

  28: End FileGrowth,

  29: SUBSTRING(physical_name, 1,1) As Drive

  30: into #temp from master.sys.master_files


  32: select t.DatabaseName, t.FileType, t.FileName, t.FileLocation, t.CurrentSizeInKB, t.MaxFileSize,t.FileGrowth, 

  33: fd.FreeSpaceMB As FreeSpaceOnDrive_MB

  34: from  #temp t

  35: inner join #fixeddrives fd on fd.Drive = t.Drive

This stored procedure returns the following information.


Posted in Performance, SQL Engine, SQL Server | Tagged: , | Leave a Comment »

Did you Know- You can perform multiple Insert, update or delete operations as a Singleton operation using the SQL 2008 MERGE operator

Posted by Sourabh Agarwal on February 24, 2012

The Merge statement in SQL Server 2008 can be used to perform multiple inserts, update and delete operations. Assume for example, in your organization you have a Source Table and a destination table. Everyday at the end of the business, data from the Source table is appended to the Target table. With SQL 2000/SQL 2005, the only way of doing this optimally (assuming that the tables are very large) was to find the Delta of all the changes in the source table and then update the same in the target table. The problem was that you would have to write your own logic or use other SQL Features like SSIS or CDC for finding the delta and then updating the target table.

With the Merge statement you can do the same without much hassles. We can also use the Merge statement in an SSIS package or a job to automate the entire sequence.

Using the MERGE Statement

Syntax for the MERGE statement is simple to understand. First we have the Target table specified by the INTO clause, followed by the Source Table specified with the USING clause. Then we have the Join parameters. The join parameters are important, because this is what will govern which records are to be inserted or deleted or updated in the target table. Then we have the WHEN clause to define the criteria for INSERT, UPDATE and DELETE.

[ TOP ( expression ) [ PERCENT ] ]
[ INTO ] target_table [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias]
USING <table_source>
ON <merge_search_condition>
[ WHEN MATCHED [ AND <clause_search_condition> ]
THEN <merge_matched> ]
[ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
THEN <merge_not_matched> ]
[ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
THEN <merge_matched> ]
[ <output_clause> ]
[ OPTION ( <query_hint> [ ,...n ] ) ]

For example, assume that your organization maintains 2 tables, one for daily purchases and another for Reporting which has the last purchase date and the total amount of purchases made by each customer.

Use tempdb

Create table PurchaseReporting
CustomerID int,
LastPurchaseDate Datetime2,
Amount Smallmoney,

Create Table DailyPurchases
CustomerID int,
productID int,
Amount Money

Insert into PurchaseReporting values (1, GETDATE(), $1000),
(2,getdate(), $2000),
(3,getdate(), $500),
(4, GETDATE(), $700)

– In the Reporting table we have 4 records for 4 different customer to indicate
– the last purchase date and the total amount of purchased goods.

Insert into DailyPurchases values(1, 2, $100),

– In the Daily purchases table we have the 7 records to indicate the products purchased
– by different customers.

select * from PurchaseReporting
Select * from DailyPurchases


– Now lets write our Merge statement to update all the changes to the Reporting Table

MERGE PurchaseReporting as SR
(Select CustomerID, getdate(), Sum(Amount) from
    GROUP BY CustomerID) as DS(CustomerID, PurchaseDate, TotalAmount)
On SR.CustomerId = DS.CustomerID
UPDATE SET SR.LastPurchaseDate = DS.PurchaseDate, SR.Amount = (SR.Amount + DS.TotalAmount)
INSERT values (DS.CustomerID, DS.PurchaseDate, DS.TotalAmount);

–After execution lets check the values of the Reporting Table Again

select * from PurchaseReporting


Notice the changes in the LastPurchaseDate and the Amount columns for the customer (1,2,3,4) and a new record with CustomerID 5 being inserted into the table.

Posted in Performance, SQL Engine | Tagged: , , | Leave a Comment »


Get every new post delivered to your Inbox.

Join 81 other followers