SQL Server Index Maintenance Performance Tuning for Large Tables

It is nothing new that indexes have their place in SQL Server, nor that these indexes need to be maintained every now and then, even though this may surprise people who are used to SQL Server being otherwise low-maintenance. But if you think about it carefully, you’ll realise that index maintenance is necessary, and that it makes sense that SQL Server does not perform this task automatically and blindly by the book.

The environments in which SQL Server is used are simply too different to lump them all together with respect to index maintenance. What has a significant effect on performance in one environment might be acceptable in another, where completely different queries are run and a suboptimally maintained index is still “good enough”.

But even the word “optimal” in the previous sentence can be the subject of great controversy. What is “optimal”? Microsoft recommends reorganising an index with a fragmentation between 5% and 30%, and rebuilding an index with a fragmentation of more than 30%. So far, so good. But as with every recommendation, one should question whether it still makes sense in one’s own environment. This is especially true when you read what high-flying rocket science was applied to determine these numbers: Where do the Books Online index fragmentation thresholds come from? :-)

In our environment we have one table into which we insert between 10 and 20 million rows daily. Every night a maintenance procedure deletes the several million rows that are older than x days, so the net growth of this table is between 3 and 5 million rows per day. Currently the table contains 975 million rows.

Determining fragmentation has become quite easy since SQL Server 2005: you can simply use the dynamic management function sys.dm_db_index_physical_stats. Here is part of the script with which we started off one day:

SELECT
    OBJECT_SCHEMA_NAME(FRAG.[object_id]) + '.' + OBJECT_NAME(FRAG.[object_id]) AS ObjectName,
    SIX.[name] AS IndexName,
    FRAG.avg_fragmentation_in_percent,
    FRAG.page_count
FROM
    sys.dm_db_index_physical_stats
    (
        DB_ID(),    --the currently connected database
        0,          --object_id: 0, NULL and DEFAULT all mean "all objects"
        DEFAULT,    --index_id: DEFAULT means "all indexes"
        0,          --partition_number: 0, NULL and DEFAULT all mean "all partitions"
        DEFAULT     --scanning mode: defaults to LIMITED, which is good enough
    ) FRAG
    JOIN
    sys.indexes SIX ON FRAG.[object_id] = SIX.[object_id] AND FRAG.index_id = SIX.index_id
WHERE
    --don't bother with heaps; if we have these at all, it is only in staging tables
    FRAG.index_type_desc <> 'HEAP' AND
    (
    --either consider only those indexes that need treatment
    (FRAG.page_count > @IPageCnt AND FRAG.avg_fragmentation_in_percent > @IMinFragmentation)
    OR
    --or do everything when it is maintenance day
    @IsMaintenanceDay = 1
    )
ORDER BY
    FRAG.avg_fragmentation_in_percent DESC;

The rows returned by this query are inserted into a temporary table and processed one after the other. After each run we determine whether we can start another operation within our daily maintenance window or not; a sketch of this driver loop follows the list below.
In the WHERE clause we filter on several criteria:

  • @IPageCnt: Defaults to 128 pages. Anything below this threshold we ignore; for such small indexes the benefit does not justify the effort.
  • @IMinFragmentation: Defaults to 10%. Anything below 10% is ignored; we could not observe any significant performance hit at fragmentation levels < 10% given our workload.
  • @IsMaintenanceDay: Once a week we have a maintenance window in which we can maintain all indexes. At least that is what we thought when we started…
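
To make this concrete, here is a minimal sketch of such a driver loop. It assumes the query above has filled a temporary table #IndexesToMaintain, and the hypothetical helper dbo.ufn_MaintenanceWindowExpired() stands in for our window check; the thresholds follow the Microsoft guidance quoted earlier. This is an illustration, not our production code:

DECLARE @ObjectName NVARCHAR(517),  --already schema-qualified, e.g. dbo.BigTable
        @IndexName  SYSNAME,
        @Frag       FLOAT,
        @Sql        NVARCHAR(MAX);

DECLARE WorkList CURSOR LOCAL FAST_FORWARD FOR
    SELECT ObjectName, IndexName, Fragmentation FROM #IndexesToMaintain;

OPEN WorkList;
FETCH NEXT FROM WorkList INTO @ObjectName, @IndexName, @Frag;

WHILE @@FETCH_STATUS = 0
BEGIN
    --reorganise between the lower threshold and 30%, rebuild above 30%
    IF @Frag > 30.0
        SET @Sql = N'ALTER INDEX ' + QUOTENAME(@IndexName) + N' ON ' + @ObjectName + N' REBUILD;';
    ELSE
        SET @Sql = N'ALTER INDEX ' + QUOTENAME(@IndexName) + N' ON ' + @ObjectName + N' REORGANIZE;';

    EXEC (@Sql);

    --stop as soon as the daily maintenance window is used up (hypothetical helper)
    IF dbo.ufn_MaintenanceWindowExpired() = 1 BREAK;

    FETCH NEXT FROM WorkList INTO @ObjectName, @IndexName, @Frag;
END

CLOSE WorkList;
DEALLOCATE WorkList;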

Depending on the level of fragmentation returned, we either reorganised or rebuilt the indexes, as suggested by the Microsoft recommendation above. This was all fine until we came into the office one Monday morning, only to find that our index job had run for more than 5 hours and that the scheduled backup job wasn’t amused about it at all.

Continues…


26 Responses to “SQL Server Index Maintenance Performance Tuning for Large Tables”

  1. Hi,

    Nice article.

    You can discover a lot more about improving SQL performance via DMVs in this new book http://www.manning.com/stirk. It contains more than 100 scripts to identify problems, and offers a wide range of solutions.

    Chapter 11 contains various scripts for automating index maintenance.

    Chapters 1 and 3 (indexes) can be downloaded for free. Chapter 1 includes scripts for:

    A simple monitor
    Finding your slowest queries
    Finding your missing indexes
    Identifying what SQL is running now
    Quickly finding a cached plan

    Thanks
    Ian

  2. Good article, but have you got the scripts for your table schemas?

  3. Interesting article – it was good to learn about the lack of predicate push-down.

    One thing that can help with index rebuild times is using the SORT_IN_TEMPDB option (assuming that tempdb is on its own array). We have a 2 billion row table that I performed a test on: using this option, I got the rebuild of 14 indexes down from 15 hours to 6.5 hours.

    Additionally, with very large tables it’s also worth considering partitioning them. Apart from the other benefits that partitioning brings, if the data being loaded into the table is naturally partitioned by date, then you only need to worry about re-indexing a single partition (unless you get a lot of modifications to old data). Plus, the sys.dm_db_index_physical_stats function allows you to specify the partition number, which speeds it up a lot. As a result, I’m currently able to re-index 14 indexes of a 2 billion row table every night (because I’m only rebuilding the current-month partition).
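
    For illustration, a minimal sketch of both options (table, index, and partition number are hypothetical placeholders):

    --rebuild with the sort running in tempdb instead of the user database
    ALTER INDEX IX_BigTable_Date ON dbo.BigTable
    REBUILD WITH (SORT_IN_TEMPDB = ON);

    --on a partitioned table, rebuild only the partition that actually changes
    ALTER INDEX IX_BigTable_Date ON dbo.BigTable
    REBUILD PARTITION = 42;

    --and restrict the fragmentation check to that single partition, which is
    --much faster than scanning all of them
    SELECT avg_fragmentation_in_percent, page_count
    FROM sys.dm_db_index_physical_stats
         (DB_ID(), OBJECT_ID(N'dbo.BigTable'), NULL, 42, 'LIMITED');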

    • Hi Karl,
      yes, we had partitioning in mind right from the start, but there has been no real need for it so far. Performance is still good, and only recently has the business made up its mind on a data retention policy. As it stands now we can throw away most rows after 1 month, so again no urgent need for partitioning.

  4. I got to learn new things from this article. However, I am using the script given below for checking index fragmentation, and it is working fine for me. It checks fragmentation of large tables only (you can define “large” in the context of your environment):

    CREATE TABLE #ROW_COUNT
    (
        name        VARCHAR(2000),
        rows        INT,
        reserved    VARCHAR(2000),
        data        VARCHAR(2000),
        index_size  VARCHAR(2000),
        unused      VARCHAR(2000)
    );

    SET NOCOUNT ON;

    DECLARE @tablename VARCHAR(2000);

    --schema-qualify the name so sp_spaceused also finds non-dbo tables
    DECLARE fetch_table_name CURSOR FOR
        SELECT SCHEMA_NAME(schema_id) + '.' + name FROM sys.objects WHERE type = 'U';

    OPEN fetch_table_name;
    FETCH NEXT FROM fetch_table_name INTO @tablename;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        --sp_spaceused returns one row per table with its size figures
        INSERT INTO #ROW_COUNT (name, rows, reserved, data, index_size, unused)
        EXEC sp_spaceused @tablename;

        FETCH NEXT FROM fetch_table_name INTO @tablename;
    END

    CLOSE fetch_table_name;
    DEALLOCATE fetch_table_name;

    --strip the trailing ' KB' from the data column and convert it to a number
    SELECT name, rows, CONVERT(INT, LEFT(data, CHARINDEX(' ', data))) AS data
    INTO #DATA_CONVERTED
    FROM #ROW_COUNT;

    SELECT
        OBJECT_NAME(a.[object_id]) AS TableName,
        b.name AS Index_Name,
        a.avg_fragmentation_in_percent,
        a.page_count,
        c.rows AS Number_Of_Rows_In_Table,
        c.data AS Amount_Of_Data_In_Table_In_KB
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) a  --current database, default (LIMITED) mode
        JOIN sys.indexes b
            ON a.index_id = b.index_id AND a.[object_id] = b.[object_id]
        JOIN #DATA_CONVERTED c
            ON OBJECT_NAME(a.[object_id]) = c.name
    WHERE a.index_type_desc <> 'HEAP'
      AND a.avg_fragmentation_in_percent > 70
      AND c.rows > 5000
      AND c.data > 25600
      AND b.type_desc <> 'HEAP';

    SET NOCOUNT OFF;

    DROP TABLE #ROW_COUNT;
    DROP TABLE #DATA_CONVERTED;

  5. We went through a similar thought and development process and arrived at a simple conclusion: in most cases it took just about as long to determine the level of fragmentation as it did to just run the ALTER INDEX REORGANIZE/REBUILD. So during our regular maintenance we simply run ALTER INDEX on most tables, using boundaries similar to 100 < #pages < [Your Max Limit]. The process cycles through the tables, keeps track of what it does, and then resumes each night where it stopped the day before.
    We still debate REORGANIZE vs. REBUILD. Each has advantages and disadvantages. :)
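
    A minimal sketch of such a resumable cycle (the work-queue table and all names are illustrative assumptions, not the commenter’s actual code):

    --persistent work queue, refilled at the start of each cycle;
    --DoneAt marks the resume point between nights
    CREATE TABLE dbo.IndexWorkQueue
    (
        QueueId    INT IDENTITY(1,1) PRIMARY KEY,
        SchemaName SYSNAME NOT NULL,
        TableName  SYSNAME NOT NULL,
        IndexName  SYSNAME NOT NULL,
        DoneAt     DATETIME NULL
    );

    --each night, work through the remaining entries until time runs out
    DECLARE @QueueId INT, @Sql NVARCHAR(MAX);

    WHILE 1 = 1
    BEGIN
        SELECT TOP (1)
            @QueueId = QueueId,
            @Sql = N'ALTER INDEX ' + QUOTENAME(IndexName)
                 + N' ON ' + QUOTENAME(SchemaName) + N'.' + QUOTENAME(TableName)
                 + N' REORGANIZE;'
        FROM dbo.IndexWorkQueue
        WHERE DoneAt IS NULL
        ORDER BY QueueId;

        IF @@ROWCOUNT = 0 BREAK;    --queue exhausted, cycle complete

        EXEC (@Sql);
        UPDATE dbo.IndexWorkQueue SET DoneAt = GETDATE() WHERE QueueId = @QueueId;

        --a real job would also stop here when the nightly window is used up
    END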

  6. I’ve noticed that if you have a newer server, like a ProLiant DL380 G5 or newer with 32 GB of RAM, two quad-core CPUs and an OK I/O configuration, you can have a silent maintenance window, because no one will notice that it’s running.

    We run 24×7 and no one ever complains about SQL during maintenance times. What you can also do is raise the fragmentation threshold to 20% or so in order to hit fewer indexes, but run maintenance more often.

  7. You should try Ola Hallengren’s IndexOptimize tool; you can specify a time limit in seconds if you only have a fixed window for this task.
    Ola.hallengren.com

  8. Hi Frank,

    Nice and useful article.
    Only one comment: SQL Server 2005 also accepts databases running in 2000 compatibility mode (as you already know). My suggestion is to mention that, in that case, this doesn’t work.

    Cheers,

    Luis

  9. This is very well written, and I very much appreciate the process from start to finish and your reasoning along the way.

    One approach you could take is to run the sys.dm_db_index_physical_stats query ahead of time and store the results in a table.
    You could start this before your maintenance window and then have your index job check the table.
    The sys.dm_db_index_physical_stats query can really run at any time: the day before, an hour before, …
    The fact that it takes ~30 min to run isn’t the issue; you would just have to start it earlier and have the table persist, which in at least one scenario you already have.
    This is just one additional approach, not any better than yours.
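
    A minimal sketch of that idea, assuming a persisted snapshot table (names illustrative; SELECT … INTO creates the table, a real job would insert into an existing one):

    --run well before the maintenance window and persist the (expensive) result
    SELECT
        GETDATE() AS SnapshotAt,
        [object_id],
        index_id,
        avg_fragmentation_in_percent,
        page_count
    INTO dbo.IndexFragSnapshot
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED');

    --later, inside the window, the index job reads the snapshot instead of
    --calling the DMF again
    SELECT [object_id], index_id, avg_fragmentation_in_percent, page_count
    FROM dbo.IndexFragSnapshot
    WHERE avg_fragmentation_in_percent > 10 AND page_count > 128;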

    Unfortunately, partitioning is only supported in Enterprise Edition and I am forced to stick with Standard :(. Unless someone knows something I don’t.
    I have even tossed around the idea of using tables by year and views with triggers to control which table an insert/update goes to. A mock-up seems to work well; I just can’t get the company to test with real data.

    • Hi Matt,
      I must confess, I hadn’t thought about running sys.dm_db_index_physical_stats as a separate step outside the maintenance window and then “just” picking up those results during the maintenance window. I guess that’ll give me food for thought now. Thanks!

  10. Frank,

    Thanks for a ‘real-world’ article. It was thought-provoking, since I have a similarly large table in our system. My question is this: you state in the article that ‘we determine the number of processors upfront, and only use half of them in the ALTER INDEX statement’. I’m not sure how you did that. Is that outside the scope of the article, or can you give us a hint about how you accomplished it? The implication was that you forced the ALTER INDEX to use only a subset of the CPUs. How was that accomplished?

    Willem
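
    One plausible way to do this (not necessarily how the article’s author did it; index and table names are placeholders) is to read the CPU count from sys.dm_os_sys_info and pass half of it as MAXDOP:

    DECLARE @HalfCpus INT, @Sql NVARCHAR(MAX);

    --cpu_count = number of logical CPUs visible to SQL Server
    SELECT @HalfCpus = cpu_count / 2 FROM sys.dm_os_sys_info;
    IF @HalfCpus < 1 SET @HalfCpus = 1;

    --MAXDOP caps the parallelism of this one REBUILD (REORGANIZE is
    --single-threaded anyway, so the option only matters for rebuilds)
    SET @Sql = N'ALTER INDEX IX_BigTable_Date ON dbo.BigTable REBUILD '
             + N'WITH (MAXDOP = ' + CAST(@HalfCpus AS NVARCHAR(10)) + N');';
    EXEC (@Sql);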

  11. Hi everyone.

    I have recently read an interesting article somewhere about using sys.dm_db_index_operational_stats to indirectly determine fragmentation in very large tables.
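
    The comment doesn’t say how, but one plausible proxy (purely illustrative) is to watch the leaf-level allocation counters, since leaf page splits are what drive fragmentation:

    --leaf_allocation_count roughly corresponds to leaf-level page splits
    --since the counters were last reset, so a high ratio of allocations to
    --inserts can hint at growing fragmentation without scanning the index
    SELECT
        OBJECT_NAME(OS.[object_id]) AS TableName,
        OS.index_id,
        OS.leaf_insert_count,
        OS.leaf_allocation_count
    FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) OS
    WHERE OS.leaf_allocation_count > 0
    ORDER BY OS.leaf_allocation_count DESC;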

  12. I would argue that indexes on tables with lots of inserts and deletes are not well maintained with just REORGANIZE or REBUILD. I made a big script that drops and recreates each of the indexes on these tables once every two weeks. I run REBUILD on ALL and update statistics on indexes twice a week.

    These jobs run in under 2 hours on all my DBs, the indexes are always being used, and they never get too fragmented.

    Soon I will have a script that drops and recreates all indexes, which I can run every 2 or 3 months.
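
    For the drop-and-recreate part, one compact way to express it is CREATE INDEX with DROP_EXISTING (placeholder names; the commenter’s actual script is not shown):

    --drops and recreates the index in one operation, avoiding the window in
    --which the table would otherwise be left without the index
    CREATE INDEX IX_BigTable_Date
    ON dbo.BigTable ([Date])
    WITH (DROP_EXISTING = ON);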

    • Hi Colin,
      I’m not sure I understand you fully. We had the idea of not bothering with detecting fragmentation on big tables at all and going straight for a drop and (re)create, but never really investigated it further when we implemented this. Since then I have thought about this as well, and it is on my to-do list for once we have reached our year-end change freeze. I guess that’ll go into a follow-up article to this one.

      How big are your tables? And how many rows are inserted or deleted on average daily?

  13. This is one of the best answers I have read online so far: just useful information, very well presented. I also found another good collection of articles on indexes in SQL Server on the internet;
    please check out this link…
    http://mindstick.com/Articles/f7e21f40-d2f3-485a-b2aa-53093bac5ff7/?Index%20in%20SQL%20Server

    Thanks

  14. Thanks for the article Frank – nice to see your approach on this.

    We have a similar, although slightly larger, issue than the 30-minute duration you talked about for the DMF: we have a 2 TB database on a DL380 G6 (12 cores) with 72 GB RAM, where running the DMF in LIMITED mode takes 9+ hours. SAMPLED was killed after 26 hours, and DETAILED – let’s not even go there (36+ hours). A large amount of data-load activity takes place on the server (it is a performance data repository).

    Given it is 2 TB in size, we don’t just want to reindex it all, but we do note that performance degrades if maintenance fails.

    Is there any other metric persisted, even if it is somewhat out of date by the time we do maintenance on the indexes, or do we have to actively ask SQL Server to determine the fragmentation of an index by examining the structure (i.e. either SHOWCONTIG or the DMF)?

    I’d like to avoid reindexing 200 GB tables if they don’t need it, but we can’t afford 10 hours just to find out whether we should do maintenance.

    • One thing that comes to mind might be to look into partitioning, to split such beasts into manageable units. Apart from that… I honestly don’t know. Sometimes I think SQL Server is more sensitive to outdated statistics than it is to fragmentation; we had several cases where updating statistics fixed time-outs on the client while fragmentation stayed the same.

      I think this is more of a black art than a science. :-)

  15. Frank, I ran into a similar issue too. We currently defrag smaller tables throughout the day based on the number of pages and the level of fragmentation. We had found that it took longer to determine fragmentation than to actually reorganize or rebuild the indexes. I ran across a post by Paul Randal that described how to create a wrapper function for sys.dm_db_index_physical_stats that can be used with CROSS APPLY. So we now determine the page count of the indexes using sys.dm_db_partition_stats, and from there we CROSS APPLY to the wrapper function. It’s been working like a charm for about 6 months now and has dramatically reduced our maintenance time for these small indexes. Check out the link below for more info. Hope it helps!

    http://www.sqlskills.com/blogs/paul/post/Indexes-From-Every-Angle-Using-CROSS-APPLY-with-sysdm_db_index_physical_stats.aspx
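
    A minimal sketch of that pattern (simplified; the wrapper’s name and column list are illustrative, see Paul Randal’s post linked above for the full version):

    --wrapper so the DMF can be fed column values via CROSS APPLY
    CREATE FUNCTION dbo.ufn_IndexPhysicalStats (@ObjectId INT, @IndexId INT)
    RETURNS @Result TABLE
    (
        avg_fragmentation_in_percent FLOAT,
        page_count                   BIGINT
    )
    AS
    BEGIN
        INSERT INTO @Result
        SELECT avg_fragmentation_in_percent, page_count
        FROM sys.dm_db_index_physical_stats(DB_ID(), @ObjectId, @IndexId, NULL, 'LIMITED');
        RETURN;
    END
    GO

    --pre-filter cheaply on page count, then probe only the survivors
    SELECT
        OBJECT_NAME(PS.[object_id]) AS TableName,
        PS.index_id,
        F.avg_fragmentation_in_percent,
        F.page_count
    FROM sys.dm_db_partition_stats PS
        CROSS APPLY dbo.ufn_IndexPhysicalStats(PS.[object_id], PS.index_id) F
    WHERE PS.index_id > 0               --skip heaps
      AND PS.used_page_count > 128;     --small indexes are not worth the trouble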
