In the first part of this series, I discussed the regular record structure used in SQL Server. In this post I will talk about a special case, Row Forwarding, and how it affects the record structures in SQL Server.
In order to understand Row Forwarding, it's important to first understand how non-clustered indexes created on a heap table work.
In the case of heap tables, the leaf pages of a non-clustered index store the RID along with the index key values. The RID is a combination of file ID, page ID and row (slot) ID that identifies a physical record in the table. It links the non-clustered index to the heap table during a scan or a seek. Consider the following example.
Assume we have a heap table with the following columns (Col1, Col2, Col3, Col4, Col5). Also assume there is a non-clustered index on the column (col1).
The index leaf page would have a similar structure (this is just an illustration!)
Now assume we run the following query against the table:
Select Col1, Col2, Col3 from Table1
where Col1 = SomeValue
This query can potentially use the non-clustered index (depending on whether the cost of using the index is lower than the cost of a table scan). If the query uses the NCI, then it can get the values of Col1 quite easily, but for the values of Col2 and Col3 it has to piggyback on the RID value to reach the actual data record in the table (PageID and RowID) and get the values from there.
This operation is called a Bookmark Lookup (SQL 2000) or a Lookup operation (SQL 2005 and onwards, where it appears as a RID Lookup for heap tables).
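A minimal sketch of the scenario described above. The data types, the index name IX_Table1_Col1 and the value 42 are assumptions made for illustration, not details from the original setup. The index hint is only there to push the optimizer onto the non-clustered index so that the lookup shows up in the execution plan.

```sql
-- Hypothetical heap table (no clustered index), for illustration only
CREATE TABLE Table1
(
    Col1 int NOT NULL,
    Col2 int NOT NULL,
    Col3 int NOT NULL,
    Col4 int NOT NULL,
    Col5 int NOT NULL
);

-- Non-clustered index on Col1; its leaf rows carry Col1 plus the RID of the heap row
CREATE NONCLUSTERED INDEX IX_Table1_Col1 ON Table1 (Col1);

-- Col2 and Col3 are not part of the index, so they are fetched from the heap via the RID
SELECT Col1, Col2, Col3
FROM Table1 WITH (INDEX (IX_Table1_Col1))
WHERE Col1 = 42;
```

With the actual execution plan enabled, the plan for the last statement should show a seek on the index followed by a lookup into the heap.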
Now assume we have a table with some variable-length columns. When the values in these columns are updated, the update might increase the size of the record. SQL Server might not be able to fit the larger record on the same page, in which case the record has to be moved to a different page.
Now if there was a non-clustered index on this table, the index would have to be modified to reflect the movement of the row. This would make update operations very expensive.
So instead of updating the non-clustered index, SQL Server creates a pointer/stub at the initial location of the record that points to the new address of the record. This way, when an NCI scan or seek reaches the record, it simply reads the pointer record and follows it to the new location of the row. The pointers are called forwarding pointers/records and the actual record is called the forwarded record.
The same record can be modified multiple times, and each time it might have to move to yet another page. This could potentially lead to a chain of forwarding/forwarded records. In reality this does not happen. The forwarded record also contains a back-pointer to the forwarding record, so when the record moves again, the engine simply updates the original forwarding record to point to the new location.
Row forwarding is bad for performance, because reading a forwarded record costs an extra page access to follow the pointer. Also, row forwarding only happens in heap tables.
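To see whether a database is affected, one option is the forwarded_record_count column of sys.dm_db_index_physical_stats. The query below is a sketch that checks every heap in the current database; the table name in the commented-out rebuild is hypothetical.

```sql
-- Count forwarded records for every heap in the current database.
-- forwarded_record_count is only populated in SAMPLED or DETAILED mode.
SELECT OBJECT_NAME(ps.object_id) AS TableName,
       ps.forwarded_record_count,
       ps.record_count,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
WHERE ps.index_id = 0                     -- index_id 0 identifies a heap
  AND ps.forwarded_record_count > 0;

-- On SQL Server 2008 and later, rebuilding the heap removes forwarded records.
-- dbo.MyHeapTable is a hypothetical name:
-- ALTER TABLE dbo.MyHeapTable REBUILD;
```

DETAILED mode reads every page of the objects it inspects, so on a large database it may be worth passing a specific object ID instead of NULL.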
Example: Examining the Forwarding Record and Forwarded Record
CREATE TABLE ForwardingRecord
(
Col1 int NOT NULL,
Col2 char(1000) NOT NULL,
Col3 varchar(3000) NULL,
Col5 varchar(4100) NOT NULL
)
Insert into ForwardingRecord values (1, Replicate('a',1000), replicate('b',1000),replicate('b',1000))
Insert into ForwardingRecord values (2, Replicate('a',1000), replicate('b',1000),replicate('b',1000))
The output of DBCC PAGE would show that there are two records on the page. Each record is about 3017 bytes in size: 3004 bytes of column data (4 + 1000 + 1000 + 1000) plus 13 bytes of record overhead.
Now let's update the second record in such a way that it no longer fits on its current page.
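One way to produce such a dump is with the undocumented DBCC IND and DBCC PAGE commands. This is a sketch under assumptions: TestDB is a hypothetical database name, and the page number 100 is a placeholder to be replaced with the data page that DBCC IND reports on your system.

```sql
-- Send DBCC output to the client session instead of the error log
DBCC TRACEON (3604);

-- List the pages allocated to the heap; data pages have PageType = 1
DBCC IND ('TestDB', 'ForwardingRecord', -1);

-- Dump one data page: (database, file id, page id, print option).
-- Print option 3 shows the page header plus per-row details.
-- Replace 100 with the PagePID returned by DBCC IND.
DBCC PAGE ('TestDB', 1, 100, 3);
```

Running the same DBCC PAGE call again after the update is what lets you compare the slot before and after the record moves.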
Update ForwardingRecord set Col5 = Replicate('v', 4100) where col1=2
Dumping the same page again and looking at slot 1 (where the original record was), we see:
As can be seen, the forwarding record holds the location of the forwarded record. Let's now dump page 280 and check the record.
As can be seen, the forwarded record holds information about the forwarding record. As Paul Randal mentions in his blog, the back-pointer is 10 bytes in size.