Performance Tuning Re-indexing and Update Statistics – A Case Study

Recently we started experiencing a very strange issue in our production reporting environment where the Re-indexing and Update Statistics operation suddenly began taking more than 2 days to complete and was thus causing blockage in the database which in turn caused impairment in application performance.

Reporting Server Configuration Details:
SQL Server Version: SQL Server 2005 Enterprise Edition
Service Pack: 3
Windows Version: Windows Server 2003 R2 Enterprise Edition
Number of CPU’s: 12
RAM: 16 GB

To troubleshoot this issue, I first separated the Re-indexing and Update Statistics operations.


Index Re-building:

Originally the Re-indexing operation was performed using the DBCC DBReindex command. When this command was executed against the database, it drops and re-creates every index present in the tables of the particular database which will take a considerable amount of time as well as cause heavy usage of the Server Hardware resources.

As per best practice recommended by Microsoft there is no need for us to perform the re-indexing of each and every index in the tables of the database. If the fragmentation level of a particular index is less than 30% then we can simply re-organize it, if it is greater than 30% then we should elect for the re-building of the index.

 First we need to understand the difference between the re-build and re-organize operations. Index rebuild will drop the existing index and re-create it again whereas index re-organize will physically re-organize the leaf node of the index. One thing to remember here is that the rebuildng operation is a very high resource utilization task.

 Based upon this, I decided to modify the Re-indexing task as follows:

-- Ensure a USE  statement has been executed first.
SET NOCOUNT ON;
DECLARE @objectid int;
DECLARE @indexid int;
DECLARE @partitioncount bigint;
DECLARE @schemaname nvarchar(130);
DECLARE @objectname nvarchar(130);
DECLARE @indexname nvarchar(130);
DECLARE @partitionnum bigint;
DECLARE @partitions bigint;
DECLARE @frag float;
DECLARE @command nvarchar(4000);
-- Conditionally select tables and indexes from the sys.dm_db_index_physical_stats function
-- and convert object and index IDs to names.
SELECT
    object_id AS objectid,
    index_id AS indexid,
    partition_number AS partitionnum,
    avg_fragmentation_in_percent AS frag
INTO #work_to_do
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, 'LIMITED')
WHERE avg_fragmentation_in_percent > 10.0 AND index_id > 0;
-- Declare the cursor for the list of partitions to be processed.
DECLARE partitions CURSOR FOR SELECT * FROM #work_to_do;
-- Open the cursor.
OPEN partitions;
-- Loop through the partitions.
WHILE (1=1)
    BEGIN;
        FETCH NEXT
           FROM partitions
           INTO @objectid, @indexid, @partitionnum, @frag;
        IF @@FETCH_STATUS < 0 BREAK;
        SELECT @objectname = QUOTENAME(o.name), @schemaname = QUOTENAME(s.name)
        FROM sys.objects AS o
        JOIN sys.schemas as s ON s.schema_id = o.schema_id
        WHERE o.object_id = @objectid;
        SELECT @indexname = QUOTENAME(name)
        FROM sys.indexes
        WHERE  object_id = @objectid AND index_id = @indexid;
        SELECT @partitioncount = count (*)
        FROM sys.partitions
        WHERE object_id = @objectid AND index_id = @indexid;
-- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding.
        IF @frag < 30.0
              SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REORGANIZE';
        IF @frag >= 30.0
            SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REBUILD';
        IF @partitioncount > 1
                    SET @command = @command + N' PARTITION=' + CAST(@partitionnum AS nvarchar(10));
                            EXEC (@command);
        --PRINT N'Executed: ' + @command;
    END;
-- Close and deallocate the cursor.
CLOSE partitions;
DEALLOCATE partitions;
-- Drop the temporary table.
DROP TABLE #work_to_do;
GO

Once the above index rebuilding logic was implemented, the query execution time reduced drastically from around 10-12 hours to a mere 2 hours 38 minutes which is an excellent improvement in performance.


Update Statistics:

The Update Statistics operation was performed using the below syntax:

Exec sp_MSForEachtable ''update statistics ? with fullscan''

The above command indicates that each and every statistic present in all the tables of the database is going to be updated with a FULLSCAN.

During the execution of the above command, I noticed that the query was taking a long long time to execute at a particular stage and that a particular column belonging to a NTEXT data type was the primary cause. Now let us first understand, How NTEXT can impair performance.

Let us first create a simple table using the below T-SQL:

create table comment
(
comment_text NTEXT
)

After the table is created, let’s insert few records into it using the below T-SQL:

insert comment
select 'ABC'
insert comment
select 'PQR'
insert comment
select 'XYZ'

Now let us check the Statistics IO count of the table named Comment:

set statistics io on
select * from comment
set statistics io off

The Statistics IO count is as follows:

logical reads 1,
physical reads 0,
read-ahead reads 0,
lob logical reads 6,
lob physical reads 0,
lob read-ahead reads 0

As you can see from the above T-SQL that though the value of the Logical Reads is only 1 but the value of the LOB Logical Reads is 6 which is pretty high. This is an expected behaviour as NTEXT never stores the data in the form of a table, instead it stores it in the form of a large objects structure which in turn will take a long long time to perform an Update Statistics operation if done with FULLSCAN. At some stage, you will notice that if you perform an Update Statistics operation with FULLSCAN it may take as long as days to complete.

Also it should be kept in mind that for a LOB column, when the statistics are created the first 100 bytes and last 100 bytes will be read to generate statistics. If we do not specify the sample rate for the UPDATE STATISTICS command, SQL Server will calculate the sample rate based on the number of rows and number of pages used by that table. In addition, the number of BLOB pages will not be considered so it is possible that when we run UPDATE STATISTIC a large volume of data will be sampled and 200 bytes of every BLOB row will be read which generates a large I/O request and increases memory consumption. Therefore it is always best that if you have a table in which there are columns of data types TEXT,NTEXT, IMAGE,VARBINARY then always perform the UPDATE STATISTICS operation without a FULLSCAN. I updated the T-SQL involved in performing the UPDATE STATISTICS operation as follows:

Update Statistics without NTEXT and Image Data Type:

SET NOCOUNT ON
DECLARE @columnname VARCHAR(MAX)
DECLARE @tablename SYSNAME
DECLARE @statsname SYSNAME
DECLARE @sql NVARCHAR(4000)
DECLARE @NAME VARCHAR(MAX)
declare @i INT
declare @j INT
create table #temp
(
tablename varchar(1000),
statsname varchar(1000),
columnname varchar(1000)
)
insert #temp(tablename,statsname,columnname)
SELECT DISTINCT
OBJECT_NAME(s.[object_id]),
s.name AS StatName,
COALESCE(@NAME+ ', ', '')+c.name
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN INFORMATION_SCHEMA.COLUMNS D ON D.[COLUMN_NAME]= C.[NAME]
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND D.DATA_TYPE NOT IN('NTEXT','IMAGE')
create table #temp1
(
id int identity(1,1),
tablename varchar(8000),
statsname varchar(8000),
columnname varchar(8000)
)
insert #temp1(tablename,statsname,columnname)
select tablename,statsname,stuff(
(
    select ','+ [columnname] from #temp where
statsname = t.statsname for XML path('')
),1,1,'')
from (select distinct tablename,statsname from #temp )t
SELECT @i=1
SELECT @j=MAX(ID) FROM #temp1
WHILE(@I<=@J)
BEGIN
SELECT @statsname = statsname from #temp1 where id = @i
SELECT @tablename = tablename from #temp1 where id = @i
SELECT @columnname = columnname from #temp1 where id = @i
SET @sql = N'UPDATE STATISTICS '+QUOTENAME(@tablename)+QUOTENAME(@statsname)+space(1)+'WITH FULLSCAN'
--PRINT @sql
EXEC sp_executesql @sql
SET @i = @i+1
END
DROP TABLE #temp
DROP TABLE #temp1

Update Statistics with NTEXT and Image Data Type:

SET NOCOUNT ON
DECLARE @columnname VARCHAR(MAX)
DECLARE @tablename SYSNAME
DECLARE @statsname SYSNAME
DECLARE @sql NVARCHAR(4000)
DECLARE @NAME VARCHAR(MAX)
declare @i INT
declare @j INT
create table #temp
(
tablename varchar(1000),
statsname varchar(1000),
columnname varchar(1000)
)
insert #temp(tablename,statsname,columnname)
SELECT DISTINCT
OBJECT_NAME(s.[object_id]),
s.name AS StatName,
COALESCE(@NAME+ ', ', '')+c.name
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN INFORMATION_SCHEMA.COLUMNS D ON D.[COLUMN_NAME]= C.[NAME]
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND D.DATA_TYPE  IN('NTEXT','IMAGE')
create table #temp1
(
id int identity(1,1),
tablename varchar(8000),
statsname varchar(8000),
columnname varchar(8000)
)
insert #temp1(tablename,statsname,columnname)
select tablename,statsname,stuff(
(
    select ','+ [columnname] from #temp where
statsname = t.statsname for XML path('')
),1,1,'')
from (select distinct tablename,statsname from #temp )t
SELECT @i=1
SELECT @j=MAX(ID) FROM #temp1
WHILE(@I<=@J)
BEGIN
SELECT @statsname = statsname from #temp1 where id = @i
SELECT @tablename = tablename from #temp1 where id = @i
SELECT @columnname = columnname from #temp1 where id = @i
SET @sql = N'UPDATE STATISTICS '+QUOTENAME(@tablename)+QUOTENAME(@statsname)
--PRINT @sql
EXEC sp_executesql @sql
SET @i = @i+1
END
DROP TABLE #temp
DROP TABLE #temp1

This is just one approach to fine tuninjg the Re-indexing and Update Statistics operation on a production reporting environment. It has now been almost 2 months since the above logic is in place and things are working absolutely fine. Please let me know if you have any comments or suggestions. ]]>

Leave a comment

Your email address will not be published.