SQL Server Performance Forum – Threads Archive

Difference bet. 2000 vs. 2005 Stored Procedure

I had a stored procedure on my SQL Server 2000 box that took about 30 minutes to run. When i ported it over to my new SQL Server 2005 test box, it takes about 16 hours to run. The stored procedure basically creates a temp table w/ about 2 million records with all the available accounts. It then creates a 2nd temp table w/ all the active accounts. Finally, it uses a delete statement w/ a simple subquery to delete the in-active accounts. Its during the deleting part of the procedure that the system seems to take the longest but it eventually completes it. Here is the delete statement: Delete #dr
From #dr dr
Where
dr.DirstatPolState+dr.DirstatPolAcctNbr+
dr.PolSerialNbr+dr.XdirstatPolicyYear
NOT IN
(Select ip.DirstatPolState+ip.DirstatPolAcctNbr+ip.PolSerialNbr+ip.XdirstatPolicyYear
From #included_policies ip) Any help w/ why it is taking so long would be greatly appreciated.

Are these two servers have the same HW/memory? Did you update statistics/reindexed the db after upgrade? When you are creating #dr and #included_policies table create coputed column for (dr.DirstatPolState+dr.DirstatPolAcctNbr+ dr.PolSerialNbr+dr.XdirstatPolicyYear ) columns…
MohammedU.
Moderator
SQL-Server-Performance.com

The concatenation method produces a query statement in less time than it would take you to write the proper query. The query itself, however, will probably take much more execution time than the proper query: DELETE FROM #dr
WHERE NOT EXISTS
(SELECT ip.* FROM #included_policies ip
WHERE #dr.DirstatPolState = ip.DirstatPolState
AND #dr.DirstatPolAcctNbr = ip.DirstatPolAcctNbr
AND #dr.PolSerialNbr = ip.PolSerialNbr
AND #dr.XdirstatPolicyYear = ip.XdirstatPolicyYear) For further improvement, I would add indexes on the listed columns in both temp tables, exactly in the order in which they’re mentioned in this query. But why are you not using the same WHERE NOT EXISTS clause when inserting into #dr? That would save you lots more time – deletion is always a relatively slow operation.

SQL Server 2005 seems to have a bug in the database engine when using the NOT IN clause. I have many examples where the new server that we purchased will not complete a query that runs in a few seconds on the older server. For example, I’m currently working on a migration to 2005 and have equivalent databases on each server. The 4 year old server with less memory and much slower processors can execute one particular query in 44 seconds routinely. The new server sits for many hours on the statement and I eventually just abort it. The execution plans also appear vastly different even though the databases are exactly the same except for having different compatibility levels and the new one has been reindexed and had the stats updated.

It’s not really a bug in the engine, as much as probabilities within a given set of data. As the data set contains more matches for an IN subquery, the NOT IN version will perform gradually better. Rather than a bug, it’s the distinction between statistics and probability. The NOT EXISTS syntax might still perform slightly better than NOT IN. And do not discount beforehand the performance of a LEFT JOIN with NULL criteria on the matching column on the lookup table. … new server – might have more CPUs than the old server had? Try running the query with OPTION (MAXDOP 1) at the very end (it would come after an ORDER BY clause).

quote:Originally posted by Adriaan It’s not really a bug in the engine, as much as probabilities within a given set of data. As the data set contains more matches for an IN subquery, the NOT IN version will perform gradually better. Rather than a bug, it’s the distinction between statistics and probability. The NOT EXISTS syntax might still perform slightly better than NOT IN. And do not discount beforehand the performance of a LEFT JOIN with NULL criteria on the matching column on the lookup table. … new server – might have more CPUs than the old server had? Try running the query with OPTION (MAXDOP 1) at the very end (it would come after an ORDER BY clause).

We have worked with MS support and they determined it was a bug. We are talking about the engine going into an infinite loop under certain situations when you using either the NOT IN or NOT EXISTS statement is used under certain scenarios. For example, on the same server without changing any data in the database, I have upgraded in place from 2000 to 2005 and have run the exact same query. In 2000 the query ran in a couple seconds, on 2005 it took 14 hours. Common characteristics of this bug is an instant jump in cpu useage to 100% without any actual work being done.

Sorry, didn’t immediately pick up on the extreme execution time. Anyways, have you looked at not inserting those unmatched rows in the step before this?

I have a similar problem with several stored procedures. Basically the stored procedures work in under 2 seconds in sql 2000 but take 30+ minutes in 2005. If I run the sql outside a stored procedure it works perfectly but if I put it in the stored procedure without changing anything it has the problem. On one of the stored procedures I managed to take out the temp table and it fixed the problem. The only stored procedures I’m getting these symptoms with are ones involving temp tables.

Update. We found our problem. And the fix is completely illogical. When passing in 7 parameters if all of the first 3 parameters are passed in the procedure runs slow. If any of the first 3 are hard coded then it runs fast. Temp fix was to take one of the first 3 parameters and store it to a variable and then use that variable. ie in the below @start is used within the queries and it works. if @start1 is used it fails.
PROCEDURE [dbo].[proc1]
(@client smallint,
@site smallint,
@start1 datetime,
@end datetime,
@type int,
@extn_number varchar(20) = NULL,
@sort_criteria int) AS
DECLARE
@start datetime SELECT
@start = @start1

this is the parameter sniffing issue
when the stored proc is compiled
it uses the actual parameter values passed in to estimate statistics so if that particular value is extreme, you are screwed
if it is translated to a variable declared inside the stored proc
a generic value is used,
which is always the same
so it is more stable there used to be some MS documents that said parameters were better than variables
until they finally realized what dumbass assumption that turned out to be search "parameter sniffing"
for SQL 2005, look into plan guides, and the OPTIMIZE FOR hint

Thanks for the pointer, I’ll go have a look.

]]>