SQL Server Performance

Difference bet. 2000 vs. 2005 Stored Procedure

Discussion in 'SQL Server 2005 T-SQL Performance Tuning' started by eriklang, Feb 9, 2007.

  1. eriklang New Member

    I had a stored procedure on my SQL Server 2000 box that took about 30 minutes to run. When i ported it over to my new SQL Server 2005 test box, it takes about 16 hours to run. The stored procedure basically creates a temp table w/ about 2 million records with all the available accounts. It then creates a 2nd temp table w/ all the active accounts. Finally, it uses a delete statement w/ a simple subquery to delete the in-active accounts. Its during the deleting part of the procedure that the system seems to take the longest but it eventually completes it. Here is the delete statement:

    Delete #dr
    From #dr dr
    Where
    dr.DirstatPolState+dr.DirstatPolAcctNbr+
    dr.PolSerialNbr+dr.XdirstatPolicyYear
    NOT IN
    (Select ip.DirstatPolState+ip.DirstatPolAcctNbr+ip.PolSerialNbr+ip.XdirstatPolicyYear
    From #included_policies ip)

    Any help w/ why it is taking so long would be greatly appreciated.
  2. MohammedU New Member

    Are these two servers have the same HW/memory?

    Did you update statistics/reindexed the db after upgrade?

    When you are creating #dr and #included_policies table create coputed column for (dr.DirstatPolState+dr.DirstatPolAcctNbr+ dr.PolSerialNbr+dr.XdirstatPolicyYear ) columns...


    MohammedU.
    Moderator
    SQL-Server-Performance.com
  3. Adriaan New Member

    The concatenation method produces a query statement in less time than it would take you to write the proper query.

    The query itself, however, will probably take much more execution time than the proper query:

    DELETE FROM #dr
    WHERE NOT EXISTS
    (SELECT ip.* FROM #included_policies ip
    WHERE #dr.DirstatPolState = ip.DirstatPolState
    AND #dr.DirstatPolAcctNbr = ip.DirstatPolAcctNbr
    AND #dr.PolSerialNbr = ip.PolSerialNbr
    AND #dr.XdirstatPolicyYear = ip.XdirstatPolicyYear)

    For further improvement, I would add indexes on the listed columns in both temp tables, exactly in the order in which they're mentioned in this query.

    But why are you not using the same WHERE NOT EXISTS clause when inserting into #dr? That would save you lots more time - deletion is always a relatively slow operation.
  4. mpex04 New Member

    SQL Server 2005 seems to have a bug in the database engine when using the NOT IN clause. I have many examples where the new server that we purchased will not complete a query that runs in a few seconds on the older server. For example, I'm currently working on a migration to 2005 and have equivalent databases on each server. The 4 year old server with less memory and much slower processors can execute one particular query in 44 seconds routinely. The new server sits for many hours on the statement and I eventually just abort it. The execution plans also appear vastly different even though the databases are exactly the same except for having different compatibility levels and the new one has been reindexed and had the stats updated.
  5. Adriaan New Member

    It's not really a bug in the engine, as much as probabilities within a given set of data. As the data set contains more matches for an IN subquery, the NOT IN version will perform gradually better. Rather than a bug, it's the distinction between statistics and probability.

    The NOT EXISTS syntax might still perform slightly better than NOT IN.

    And do not discount beforehand the performance of a LEFT JOIN with NULL criteria on the matching column on the lookup table.

    ... new server - might have more CPUs than the old server had? Try running the query with OPTION (MAXDOP 1) at the very end (it would come after an ORDER BY clause).
  6. mpex04 New Member

    quote:Originally posted by Adriaan

    It's not really a bug in the engine, as much as probabilities within a given set of data. As the data set contains more matches for an IN subquery, the NOT IN version will perform gradually better. Rather than a bug, it's the distinction between statistics and probability.

    The NOT EXISTS syntax might still perform slightly better than NOT IN.

    And do not discount beforehand the performance of a LEFT JOIN with NULL criteria on the matching column on the lookup table.

    ... new server - might have more CPUs than the old server had? Try running the query with OPTION (MAXDOP 1) at the very end (it would come after an ORDER BY clause).


    We have worked with MS support and they determined it was a bug. We are talking about the engine going into an infinite loop under certain situations when you using either the NOT IN or NOT EXISTS statement is used under certain scenarios.

    For example, on the same server without changing any data in the database, I have upgraded in place from 2000 to 2005 and have run the exact same query. In 2000 the query ran in a couple seconds, on 2005 it took 14 hours. Common characteristics of this bug is an instant jump in cpu useage to 100% without any actual work being done.
  7. Adriaan New Member

    Sorry, didn't immediately pick up on the extreme execution time.

    Anyways, have you looked at not inserting those unmatched rows in the step before this?
  8. jpcw New Member

    I have a similar problem with several stored procedures. Basically the stored procedures work in under 2 seconds in sql 2000 but take 30+ minutes in 2005. If I run the sql outside a stored procedure it works perfectly but if I put it in the stored procedure without changing anything it has the problem.

    On one of the stored procedures I managed to take out the temp table and it fixed the problem. The only stored procedures I'm getting these symptoms with are ones involving temp tables.
  9. jpcw New Member

    Update. We found our problem. And the fix is completely illogical.

    When passing in 7 parameters if all of the first 3 parameters are passed in the procedure runs slow. If any of the first 3 are hard coded then it runs fast. Temp fix was to take one of the first 3 parameters and store it to a variable and then use that variable. ie in the below @start is used within the queries and it works. if @start1 is used it fails.


    PROCEDURE [dbo].[proc1]
    (@client smallint,
    @site smallint,
    @start1 datetime,
    @end datetime,
    @type int,
    @extn_number varchar(20) = NULL,
    @sort_criteria int) AS
    DECLARE
    @start datetime

    SELECT
    @start = @start1
  10. joechang New Member

    this is the parameter sniffing issue
    when the stored proc is compiled
    it uses the actual parameter values passed in to estimate statistics

    so if that particular value is extreme, you are screwed
    if it is translated to a variable declared inside the stored proc
    a generic value is used,
    which is always the same
    so it is more stable

    there used to be some MS documents that said parameters were better than variables
    until they finally realized what dumbass assumption that turned out to be

    search "parameter sniffing"
    for SQL 2005, look into plan guides, and the OPTIMIZE FOR hint
  11. jpcw New Member

    Thanks for the pointer, I'll go have a look.

Share This Page