SQL Server Performance

Alter index Rebuilds - sort_in_TempDB

Discussion in 'Performance-Related Article Discussions' started by joetig, Mar 20, 2009.

  1. joetig New Member

    I have an automated process that generates and runs the alter index rebuild statements for each index in my database based on fragmentation. It has been running fine for over one year. I am now looking into utilizing the SORT_IN_TEMPDB options since some of these tables are large. According to the documentation, you can utilize this option with the ONLINE = ON.
    I tried some tests on a stand alone db monitoring the tempdb usage. I set the trigger levels to capture tempdb data from the dmv's using sys.dm_db_file_space_usage when the freespace dropped below 98% and it did not change.
    I searched the web and came across this technical Article, SQL Server 2005 Online Index Operations. Reading the article it basically stated that using online operations would not trigger any activity in TEMPDB event if the SORT_IN_TEMPDB option was set.
    Does anybody know why this is this way. Why allow the combination of ONLINE and SORT_IN_TEMPDB both ON if it does nothing?
    The article mentioned that it does utilize TEMPDB when processing an non-clustered index with ONLINE = OFF. This I was able to verify.
  2. geebee2 New Member

    Not familiar with this, but it looks like SORT_IN_TEMPDB cannot be used when ONLINE=ON.
    ONLINE=ON is the more important option ( since it has a major effect on functionality, whereas SORT_IN_TEMPDB may just be faster ), and when it is specified SORT_IN_TEMPDB cannot be used, and is effectively ignored.
    [Later]
    http://msdn.microsoft.com/en-us/library/ms191261.aspx
    shows that ONLINE=ON implies a serperate copy of the index is built, so SORT_IN_TEMPDB is meaningless/redundant in this case.
  3. joetig New Member

    I am not sure why you would say that the SORT_IN_TEMPDB is meaningless. If you will notice that during the build phase there is a sort/merge step. The object is to improve performance and since rebuild of indexes online does create a new copy of the index, i thought that this option would help speed it up. I came across an excellant whitepaper that performed some testing and showed the rebuilds in action. It appears that it is utilizing the SORT_IN_TEMPDB option when performing non-clustered rebuilds with online = off. Again my question is, why would it use it only with online = off, when the index is unavailable to the client anyway and not when performing online. The help should specify if the options are mutually exclusive
    http://whitepapers.techrepublic.com.com/abstract.aspx?docid=350280
  4. geebee2 New Member

    joetig
    Apologies, you are right, I didn't read carefully enough.
    My guess is that online=on uses different code, and Microsoft didn't see the need to implement SORT_IN_TEMPDB for online=on, since speed is not critical in this case.
  5. geebee2 New Member

    Ah, the word document located http://technet.microsoft.com/en-gb/library/cc966402.aspx
    states
    "When both SORT_IN_TEMPDB and ONLINE are set to ON, the index transactions are stored in the tempdb transaction log, and the concurrent user transactions are stored in the transaction log of the user database. This allows you to truncate the transaction log of the user database during the index operation if needed. Additionally, if the tempdb log is not on the same disk as the user database log, the two logs are not competing for the same disk space."
  6. joetig New Member

    Yes, I did read that. But also in that same document there is a table that shows for each online rebuild, it has 0 in the sort runs. This is what I experienced in my tests and I was monitoring the TEMPDB space on a standalone machine and the free space did not move at all. So it shows online nonclusterd with 0 and offline non-clustered with 1.07. Same index, different results. That is my question and why isn't the help description more helpfull, you have to dig and search for this document to explain. That is basically would I am saying. I would like to take you for your responses [:)]
    Index OperationRatio of temporary space to the size of the index
    Single-column index8% of row size(4 bytes out of 46 bytes)Three-column index25% of row size(10 bytes out of 46 bytes)All-column index100% of row size(46 bytes out of 46 bytes)
    mapping indexsort runstotalmapping indexsort runstotalmapping indexsort runstotal
    Online clustered index create0.331.11.430.521.121.641.211.112.32
    Online clustered index rebuild0.1700.170.3100.311.0101.01
    Online nonclustered index create01.071.0701.051.0501.021.02
    Online nonclustered index rebuild000000000
    Offline clustered index create01.021.0201.021.0201.011.01
    Offline clustered index rebuild000000000
    Offline nonclustered index create or rebuild01.071.0701.051.0501.021.02
  7. geebee2 New Member

    Well I'm all at sea here, but is it that only CREATE requires sorting, online REBUILD does not do any sorting, because the original, already sorted, index is used?
    I should re-read everything, I did start by saying I'm not familiar with all this!
  8. joetig New Member

    Not a problem. I am no expert here either. That is why I posted the question. Trying to understand the process better.
  9. satya Moderator

    There is a catch in terms of TEMPDB log usage when SORT_IN_TEMPDB option is used, it doesn't change the fact that your data is "modified" in the database and those modifications are logged. So keepup these modified changes or information the tempdb log must be used extensively and same way applies to user database.
    When it is used the sort results are stored in the filegroup or partition scheme in which the resulting index is stored. Just to revisit on BOL reference that:
    ... tempdb is on a separate set of disks from the destination filegroup, during the first phase, the reads of the data pages occur on a different disk from the writes to the sort work area in tempdb. This means the disk reads of the data keys generally continue more serially across the disk, and the writes to the tempdb disk also are generally serial, as do the writes to build the final index. Even if other users are using the database and accessing separate disk addresses, the overall pattern of reads and writes are more efficient when SORT_IN_TEMPDB is specified than when it is not.
    I wanted to know whether you are after process of online index or issues with free space on disk?
  10. joetig New Member

    First of all, yes TEMPDB is located on seperate disks. I am not having any issues currently with the process. It has been running for about a year on the live site and reorganizing the indexes weekly. Since I am working with VLDB, I am always looking at ways to improve performance. I capture the fragmentation levels on Sunday and run the online rebuilds in 30 minutes chunks early in morning untill I have processed all the indexes. This usaually completes by Wednesday. The tables that cannot be done online are bypassed and only done during maintenance windows.
    We originally encountered issues with the online rebuilds related to parallel processing and worked with Microsoft the develop a fix. As far as the sort_in_tempdb goes, we looking at ways to improve the process since the database is constatntly growing. Looking at the help, it appearred that Sort_in_tempdb could be used to speed up the operations.
    I realize that my test machine only resembled the database size and but did not have any user activity during this time which is not like the live site. The technical document that I found stated that the online rebuilds would not use tempdb regardless and only the offline process would utilize tempdb when processing non clustered indexes. I am just trying to verify that this is the way it works.
    Since the database is part of a log shipping partnership, we are also hoping to cut down on the size of the log files that are shipped as well.

Share This Page