SQL Server Index Maintenance Performance Tuning for Large Tables

It is nothing really new that indexes have their right to exist in SQL Server. It is also nothing really new, that these indexes need to be maintained every now and then, even though this might be surprising for people who are lulled by the otherwise low maintenance for SQL Server. But if you think about it carefully, you’ll realise, that index maintenance is necessary and that it makes sense, that SQL Server is not doing this task automatically blindly according to the book.

The various environments in which SQL Server is used are simply too different to lump them all together with respect to index maintenance. What might have significant effects on the performance in one environment might be acceptable in another environment, where completely different queries are run, for which a maybe sub optimally maintained index is still “good enough”.

But even the word “optimal” in the previous sentence can be subject of great controversy. What is “optimal”? Microsoft recommends reorganising an index with a fragmentation between 5% and 30%, and rebuilding an index with a fragmentation of more than 30%. You can read this here. So far, so good. But as with every recommendation one should question them, whether they still make sense in one’s own environment or not. This is especially true, when you read what high-flying rocket science was applied while determining these numbers: Where do the Books Online index fragmentation thresholds come from? 🙂

In our environment we have one table, into which we insert between 10 and 20 million rows daily. Every night we run a maintenance procedure that deletes several million rows that are older than x days. So, the net increase for this table is between 3 – 5 million rows every day. Currently this table contains 975 million rows.

Determining the fragmentation has become quite easy since SQL Server 2005. You can simply use the dynamic management function sys.dm_db_index_physical_stats. Here is a part of the script with which we started off one day:

    OBJECT_SCHEMA_NAME(FRAG.[object_id]) + '.' + OBJECT_NAME(FRAG.[object_id]),
        DB_ID(),    --use the currently connected database
        0,          --Parameter for object_id.
        DEFAULT,    --Parameter for index_id.
        0,          --Parameter for partition_number.
        DEFAULT     --Scanning mode. Default to "LIMITED", which is good enough
    ) FRAG
    sys.indexes SIX ON FRAG.[object_id] = SIX.[object_id] AND FRAG.index_id = SIX.index_id
    --don't bother with heaps, if we have these anyway outside staging tables.
    FRAG.index_type_desc <> 'HEAP' AND
    --Either consider only those indexes that need treatment
    (FRAG.page_count > @IPageCnt AND FRAG.avg_fragmentation_in_percent > @IMinFragmentation)
    --or do everything when it is MaintenanceDay
    @IsMaintenanceDay = 1
    FRAG.avg_fragmentation_in_percent DESC;

The rows that are returned are inserted into a temporary table and are used one after the other. After each run we determine whether we can start another operation in our daily maintenance window or not.
In the WHERE clause we filter for several criteria:

  • @IPageCnt: Default value is 128. Anything below this threshold we ignore. The benefits here do not justify the efforts.
  • @IMinFragmentation: This defaults to 10%. Anything below 10% is ignored. We couldn’t observe any significant performance hit with fragmentation levels < 10% given our workload.
  • @IsMaintenanceDay: Once a week we have a maintenance window, in which we can maintain all indexes. At least we thought so, when we started…

Depending on the level of fragmentation that was returned, we either reorganised or rebuilt the indexes, as it was suggested by the above MS recommendation. This was all fine until we came into the office one Monday morning, just to find out that our index job ran for more than 5 hours and that the scheduled backup job wasn’t really amused about it at all.


Leave a comment

Your email address will not be published.