Performance Tuning Re-indexing and Update Statistics – A Case Study

Recently we started experiencing a very strange issue in our
production reporting environment where the Re-indexing and Update Statistics
operation suddenly began taking more than 2 days to complete
and was thus causing blockage in the database which in turn caused
impairment in application performance.

Reporting Server
Configuration Details:


SQL Server Version:
SQL Server 2005 Enterprise Edition

Service Pack: 3

Windows Version:
Windows Server 2003 R2 Enterprise Edition

Number of CPU’s:
12

RAM: 16 GB

To troubleshoot this issue, I first separated the
Re-indexing and Update Statistics operations.

Index Re-building:

Originally the Re-indexing operation was performed using the
DBCC DBReindex command. When this command was executed against the database, it
drops and re-creates every index present in the tables of the
particular database which will take a considerable amount of time as well as
cause heavy usage of the Server Hardware resources.

As per best practice recommended by Microsoft there is
no need for us to perform the re-indexing of each and every index in
the tables of the database. If the fragmentation level of a particular index is
less than 30% then we can simply re-organize it, if it is greater than 30%
then we should elect for the re-building of the index.

 First we need to understand the difference between the re-build and re-organize operations. Index rebuild will
drop the existing index and re-create it again whereas index re-organize will physically re-organize the leaf node of the index. One
thing to remember here is that the rebuildng
operation is a very high resource utilization task.

 Based upon this, I
decided to modify the Re-indexing task as follows:

-- Ensure a USE  statement has been executed first.

SET NOCOUNT ON;
DECLARE @objectid int;
DECLARE @indexid int;
DECLARE @partitioncount bigint;
DECLARE @schemaname nvarchar(130);
DECLARE @objectname nvarchar(130);
DECLARE @indexname nvarchar(130);
DECLARE @partitionnum bigint;
DECLARE @partitions bigint;
DECLARE @frag float;
DECLARE @command nvarchar(4000);

-- Conditionally select tables and indexes from the sys.dm_db_index_physical_stats function
-- and convert object and index IDs to names.

SELECT
    object_id AS objectid,
    index_id AS indexid,
    partition_number AS partitionnum,
    avg_fragmentation_in_percent AS frag

INTO #work_to_do
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, 'LIMITED')
WHERE avg_fragmentation_in_percent > 10.0 AND index_id > 0;

-- Declare the cursor for the list of partitions to be processed.
DECLARE partitions CURSOR FOR SELECT * FROM #work_to_do;

-- Open the cursor.
OPEN partitions;

-- Loop through the partitions.
WHILE (1=1)

    BEGIN;
        FETCH NEXT
           FROM partitions
           INTO @objectid, @indexid, @partitionnum, @frag;

        IF @@FETCH_STATUS < 0 BREAK;
        SELECT @objectname = QUOTENAME(o.name), @schemaname = QUOTENAME(s.name)

        FROM sys.objects AS o
        JOIN sys.schemas as s ON s.schema_id = o.schema_id
        WHERE o.object_id = @objectid;
        SELECT @indexname = QUOTENAME(name)
        FROM sys.indexes
        WHERE  object_id = @objectid AND index_id = @indexid;
        SELECT @partitioncount = count (*)
        FROM sys.partitions
        WHERE object_id = @objectid AND index_id = @indexid;

-- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding.
        IF @frag < 30.0
              SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REORGANIZE';

        IF @frag >= 30.0
            SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REBUILD';

        IF @partitioncount > 1
                    SET @command = @command + N' PARTITION=' + CAST(@partitionnum AS nvarchar(10));

                            EXEC (@command);
        --PRINT N'Executed: ' + @command;

    END;

-- Close and deallocate the cursor.

CLOSE partitions;
DEALLOCATE partitions;

-- Drop the temporary table.

DROP TABLE #work_to_do;
GO

Once the above index rebuilding logic was implemented,
the query execution time reduced drastically from around 10-12 hours to a mere
2 hours 38 minutes which is an excellent improvement in performance.

Update Statistics:

The Update Statistics operation was performed using the
below syntax:

Exec sp_MSForEachtable ''update statistics ? with fullscan''

The above command indicates that each and every statistic
present in all the tables of the database is going to be updated with a
FULLSCAN.

During the execution of the above command, I noticed that
the query was taking a long long time to execute at a
particular stage and that a particular column belonging to a
NTEXT data type was the primary cause. Now let us
first understand, How NTEXT can impair performance.

Let us first create a simple table using the below T-SQL:

create table comment

(
comment_text NTEXT
)

After the table is created, let’s insert few
records into it using the below T-SQL:

insert comment
select 'ABC'
insert comment
select 'PQR'
insert comment
select 'XYZ'

Now let us check the Statistics IO count of the table named
Comment:

set statistics io on
select * from comment
set statistics io off

The Statistics IO
count is as follows:

logical reads 1,
physical reads 0,
read-ahead reads 0,
lob logical reads 6,
lob physical reads 0,
lob read-ahead reads 0

As you can see from the above T-SQL that though the value of
the Logical Reads is only 1 but the value of the LOB Logical Reads is 6 which is pretty high. This is an expected behaviour
as NTEXT never stores the data in the form of a table,
instead it stores it in the form of a large objects structure which in turn will
take a long long time to perform an Update Statistics
operation if done with FULLSCAN. At some stage, you will notice that if you
perform an Update Statistics operation with FULLSCAN it
may take as long as days to complete.

Also it should be kept
in mind that for a LOB column, when
the statistics are created the first 100 bytes and last 100 bytes will be read
to generate statistics. If we do not specify the sample rate for the UPDATE
STATISTICS command, SQL Server will calculate the sample rate based on the
number of rows and number of pages used by that table. In addition, the number
of BLOB pages will not be considered so it is possible that when we run UPDATE
STATISTIC a large volume of data will be sampled and 200 bytes of every BLOB row will be
read which generates a large I/O request and increases memory consumption.
Therefore it is always best that if you have a table in which there are columns
of data types TEXT,NTEXT, IMAGE,VARBINARY then always
perform the UPDATE STATISTICS operation without a FULLSCAN
. I updated the T-SQL
involved in performing the UPDATE STATISTICS operation as follows:

Update Statistics
without NTEXT and Image Data Type:

SET NOCOUNT ON
DECLARE @columnname VARCHAR(MAX)
DECLARE @tablename SYSNAME
DECLARE @statsname SYSNAME
DECLARE @sql NVARCHAR(4000)
DECLARE @NAME VARCHAR(MAX)
declare @i INT
declare @j INT

create table #temp
(
tablename varchar(1000),
statsname varchar(1000),
columnname varchar(1000)
)

insert #temp(tablename,statsname,columnname)
SELECT DISTINCT
OBJECT_NAME(s.[object_id]),
s.name AS StatName,

COALESCE(@NAME+ ', ', '')+c.name
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id

JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN INFORMATION_SCHEMA.COLUMNS D ON D.[COLUMN_NAME]= C.[NAME]
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]

WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND D.DATA_TYPE NOT IN('NTEXT','IMAGE')

create table #temp1
(
id int identity(1,1),
tablename varchar(8000),
statsname varchar(8000),
columnname varchar(8000)
)

insert #temp1(tablename,statsname,columnname)
select tablename,statsname,stuff(
(

    select ','+ [columnname] from #temp where
statsname = t.statsname for XML path('')

),1,1,'')
from (select distinct tablename,statsname from #temp )t
SELECT @i=1
SELECT @j=MAX(ID) FROM #temp1
WHILE(@I<=@J)
BEGIN

SELECT @statsname = statsname from #temp1 where id = @i
SELECT @tablename = tablename from #temp1 where id = @i
SELECT @columnname = columnname from #temp1 where id = @i

SET @sql = N'UPDATE STATISTICS '+QUOTENAME(@tablename)+QUOTENAME(@statsname)+space(1)+'WITH FULLSCAN'

--PRINT @sql
EXEC sp_executesql @sql  
SET @i = @i+1
END

DROP TABLE #temp
DROP TABLE #temp1

Update Statistics with
NTEXT and Image Data Type:

SET NOCOUNT ON
DECLARE @columnname VARCHAR(MAX)
DECLARE @tablename SYSNAME
DECLARE @statsname SYSNAME
DECLARE @sql NVARCHAR(4000)
DECLARE @NAME VARCHAR(MAX)
declare @i INT
declare @j INT

create table #temp
(
tablename varchar(1000),
statsname varchar(1000),
columnname varchar(1000)
)

insert #temp(tablename,statsname,columnname)
SELECT DISTINCT
OBJECT_NAME(s.[object_id]),

s.name AS StatName,
COALESCE(@NAME+ ', ', '')+c.name

FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN INFORMATION_SCHEMA.COLUMNS D ON D.[COLUMN_NAME]= C.[NAME]
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND D.DATA_TYPE  IN('NTEXT','IMAGE')

create table #temp1
(
id int identity(1,1),
tablename varchar(8000),
statsname varchar(8000),
columnname varchar(8000)
)

insert #temp1(tablename,statsname,columnname)

select tablename,statsname,stuff(
(
    select ','+ [columnname] from #temp where
statsname = t.statsname for XML path('')

),1,1,'')

from (select distinct tablename,statsname from #temp )t
SELECT @i=1
SELECT @j=MAX(ID) FROM #temp1

WHILE(@I<=@J)
BEGIN
SELECT @statsname = statsname from #temp1 where id = @i
SELECT @tablename = tablename from #temp1 where id = @i
SELECT @columnname = columnname from #temp1 where id = @i

SET @sql = N'UPDATE STATISTICS '+QUOTENAME(@tablename)+QUOTENAME(@statsname)

--PRINT @sql
EXEC sp_executesql @sql  
SET @i = @i+1
END

DROP TABLE #temp
DROP TABLE #temp1         

This is just one approach
to fine tuninjg the Re-indexing and Update Statistics operation on a production reporting environment. It has now been almost 2 months since the
above logic is in place and things are working absolutely fine. Please let me know if you have any comments or suggestions.




Array

No comments yet... Be the first to leave a reply!