THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Merrill Aldrich

Visualizing Data File Layout II

Part 2 of a blog series visually demonstrating the layout of objects on data pages in SQL Server

Part 1

In Part 1 of this series, I introduced a little demo app that renders the layout of pages in SQL Server files by object. Today I’ll put that app through its paces to show, in vivid color (well, teal, anyway) the destructive power of the famous Re-Index Then Shrink anti-pattern for index maintenance.

This one is very easy to demo, so let’s go!

First, I created a demo database VizDemo1, with a single 200 MB data file. Into that database I placed a canonical table – highly simplified for this example – clustered on an ever-increasing integer, using identity():

USE VizDemo1
GO

CREATE TABLE dbo.SampleCustomers  ( 
    id int identity( 1, 1 ) NOT NULL PRIMARY KEY CLUSTERED, 
    buncha char(500) default 'A', 
    big char(500) default 'B', 
    vals char(500)  default 'C'
);
GO

Then we populate that table with some dummy data:

INSERT dbo.SampleCustomers DEFAULT VALUES;

GO 40000

And finally, fire up the little visualizer app and process the database:

VizDemo1_01

The small color bands at the top left corner of the image are the system tables and such that are in every “empty” database to make it run. The blue/green/teal area is the new table we created and populated with sample data, and the gray area represents empty regions in the file.

As expected, the table started writing into the first available space, and, because the cluster key is increasing, pages were allocated to the end of the page sequence in order, and we end up with a crazy-perfect, contiguous linked list on disk.

You can see small darker bars at intervals within the table – most of the pages in the index are “type 1” pages, which are the leaf-level/rows in the clustered index. Those bars are “type 2” index pages that have the upper level(s) of the index. The reason they are darker is that those are a disruption in the leaf level linked list, and the app shades such disruptions as a way to see fragmentation. The list has to “hop over” those pages and then continue on the other side. It’s technically fragmentation, but at this point not harmful at all – but remember that darker color that shows a break in the page order.

A side note: in the midst of the gray area you can see one orange line (and another in the sea of teal). Those are “type 11” PFS pages, which happen on a fixed interval in every file. I don’t think they ever move – they track file allocation and free space metadata. They are like rocks in the stream…

Now, what happens if we re-index this bad boy? Well, a re-index operation has to write all the pages for the object into new, blank pages in the file, and then abandon the old pages. I run:

-- This "moves" all the data toward the end of the file, into free areas
ALTER INDEX ALL ON dbo.SampleCustomers REBUILD;
GO

Then re-analyze the file. As expected, the table has “moved” toward the end of the file, and left free space toward the beginning. It’s still not fragmented, because we had enough room, and it was written in order into that new area by the rebuild:

VizDemo1_02

We can see the gray area near the top is all the “abandoned” pages where the index was, and the data has all moved down into the free area. Ah, but that seems wasteful to some people, am I right? All that empty space – the file could be smaller!

Let’s see the damage that Shrink File does. Imagine that I do this:

DBCC SHRINKFILE (N'VizDemo1' , 70)
GO

First, before we shrink, let’s just scroll down and look at the end of the file:

VizDemo1_03

We have two conditions – the gray part of the file is sort of OK to shrink. There’s just a lone PFS page out there, and removing that does no harm. But once we get into that blue area, the data has to be moved back up into the beginning of the file. Here’s where the problem lies, as I learned from Mr. Paul Randal – the shrink routine will move a page at a time back into that free space, starting from the end, going backward. That makes the pages land in approximately reverse order from the correct index order. Perfect fragmentation. Let’s see if this tool proves him right. Shrink, then re-analyze:

VizDemo1_04

Yep, it’s not immediately apparent, perhaps, but that teal color is a darker shade that indicates every page is a fragment boundary in most of the index – perfect fragmentation! Here’s a better view:

VizDemo1_04a

So, how can we clean that up? Well, with a rebuild. But … we need that bigger file. In fact, practically any database in production needs this overhead of available space to be able to perform index maintenance. It’s not “wasted” space at all.

ALTER INDEX ALL ON dbo.SampleCustomers REBUILD;
GO

After the rebuild, the index is back toward the end of the file, but it’s also back in order:

VizDemo1_05

Now, in light of this information, imagine nightly re-indexing on a database with … AutoShrink! <shudder>

Published Wednesday, January 23, 2013 4:40 PM by merrillaldrich

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Wolfgang said:

Hello there,

this is a very great article. Short, easy to understand and really interesting.

The only thing I cannot agree with is that you call the rebuild-overhead a not "wasted" space. As so often it depends:

In our production system some of our tables need some hundreds GB(!) space on a high availability storage. As this is really expensive it IS "wasted" space for us. That's why we do a reorg instead of a rebuild.

We already thought rebuilding works like you showed here but it's great to get the acknowledgment.

Keep on writing :-)

January 28, 2013 4:28 AM
 

merrillaldrich said:

Thanks! True, a database with one or a few very large tables can pose a challenge because so much more overhead space is needed if you reindex, compared to a database with many smaller tables. The two ways I know to mitigate are partitioning (individual partitions can be rebuilt) or, as you say, reorg. If partitioning isn't an option, I still prefer to have the overhead available - imagine, for example, if you needed to make a change to the CI.

January 28, 2013 8:37 AM
 

Merrill Aldrich said:

This is part three of a blog series illustrating a method to render the file structure of a SQL Server

January 29, 2013 12:45 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement