SQL Server Performance

Performance Issue - 2000 Vs 2005

Discussion in 'SQL Server 2005 Performance Tuning for DBAs' started by harikrish, Oct 27, 2008.

  1. harikrish New Member

    Hi,
    We are migrating from 2000 to 2005 and experiencing some performance issues in 2005.
    Example query:
    select subcategory_id, subsubcategory, cat_code_3
    from #tmpsubsubcategory t
    where str(subcategory_id) + cat_code_3 not in
    (select str(subcategory_id) + cat_code3 from stage.stg_sub_subcategory)
    Following are the execution plans in 2005 and 2000:
    2005
    select
    subcategory_id,subsubcategory,cat_code_3
    from #tmpsubsubcategory t
    where str(subcategory_id) + cat_code_3 not in
    (select str(subcategory_id) + cat_code3 from stage.stg_sub_subcategory)
    |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([Expr1008]))
    |--Nested Loops(Left Anti Semi Join)
    | |--Nested Loops(Left Anti Semi Join, WHERE:([Expr1008] IS NULL))
    | | |--Compute Scalar(DEFINE:([Expr1008]=str(CONVERT_IMPLICIT(float(53),[tempdb].[dbo].[#tmpsubsubcategory].[subcategory_id] as [t].[subcategory_id],0))+[tempdb].[dbo].[#tmpsubsubcategory].[cat_code_3] as [t].[cat_code_3]))
    | | | |--Table Scan(OBJECT:([tempdb].[dbo].[#tmpsubsubcategory] AS [t]))
    | | |--Top(TOP EXPRESSION:((1)))
    | | |--Table Scan(OBJECT:([CostcoCanada].[stage].[stg_sub_subcategory]))
    | |--Top(TOP EXPRESSION:((1)))
    | |--Table Scan(OBJECT:([CostcoCanada].[stage].[stg_sub_subcategory]), WHERE:((str(CONVERT_IMPLICIT(float(53),[CostcoCanada].[stage].[stg_sub_subcategory].[subcategory_id],0))+[CostcoCanada].[stage].[stg_sub_subcategory].[cat_code3]) IS NULL))
    |--Top(TOP EXPRESSION:((1)))
    |--Table Scan(OBJECT:([CostcoCanada].[stage].[stg_sub_subcategory]), WHERE:([Expr1008]=(str(CONVERT_IMPLICIT(float(53),[CostcoCanada].[stage].[stg_sub_subcategory].[subcategory_id],0))+[CostcoCanada].[stage].[stg_sub_subcategory].[cat_code3])))
    2000
    select
    subcategory_id,subsubcategory,cat_code_3
    from #tmpsubsubcategory t
    where str(subcategory_id) + cat_code_3 not in
    (select str(subcategory_id) + cat_code3 from stg_sub_subcategory)
    |--Hash Match(Right Anti Semi Join, HASH:([Expr1004])=([Expr1005]), RESIDUAL:([Expr1005]=[Expr1004]))
    |--Compute Scalar(DEFINE:([Expr1004]=str(Convert([stg_sub_subcategory].[subcategory_id]), NULL, NULL)+[stg_sub_subcategory].[cat_code3]))
    | |--Table Scan(OBJECT:([CRXCostcoCanadaStage].[dbo].[stg_sub_subcategory]))
    |--Compute Scalar(DEFINE:([Expr1005]=str(Convert([t].[subcategory_id]), NULL, NULL)+[t].[cat_code_3]))
    |--Nested Loops(Left Anti Semi Join)
    |--Nested Loops(Left Anti Semi Join, WHERE:(str(Convert([t].[subcategory_id]), NULL, NULL)+[t].[cat_code_3]=NULL))
    | |--Table Scan(OBJECT:([tempdb].[dbo].[#tmpsubsubcategory__________________________________________________________________________________________________00000000EDA1] AS [t]))
    | |--Row Count Spool
    | |--Top(1)
    | |--Table Scan(OBJECT:([CRXCostcoCanadaStage].[dbo].[stg_sub_subcategory]))
    |--Row Count Spool
    |--Table Scan(OBJECT:([CRXCostcoCanadaStage].[dbo].[stg_sub_subcategory]), WHERE:(str(Convert([stg_sub_subcategory].[subcategory_id]), NULL, NULL)+[stg_sub_subcategory].[cat_code3]=NULL))

    Here a Nested Loops join is used in 2005, while a Hash Join is used by default in 2000. Is there any way the database engine can be tuned so that 2005 uses a Hash Join by default?
    Regards
    Hari
  2. FrankKalis Moderator

    I would start by tuning the query and probably rewrite it to a NOT EXISTS or a LEFT JOIN.
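    For illustration, a LEFT JOIN version of the posted query might look like the sketch below. The table and column names are taken from the thread; note that NOT IN and a LEFT JOIN can differ when the compared columns contain NULLs, so verify the NULL semantics against your data before swapping one for the other.

    ```sql
    -- Hypothetical LEFT JOIN rewrite of the posted query: rows from the
    -- temp table with no matching row in the staging table survive the
    -- "s.subcategory_id IS NULL" filter.
    select t.subcategory_id, t.subsubcategory, t.cat_code_3
    from #tmpsubsubcategory t
    left join stage.stg_sub_subcategory s
        on s.subcategory_id = t.subcategory_id
       and s.cat_code3 = t.cat_code_3
    where s.subcategory_id is null;
    ```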
  3. Adriaan New Member

    You should drop the concatenation in the WHERE clause of the subquery, and just use AND. Any type of function that you apply to a column means that any index on that column will be ignored.
    select t.subcategory_id, t.subsubcategory, t.cat_code_3
    from #tmpsubsubcategory t
    where t.subcategory_id not in
    (select x.subcategory_id from stage.stg_sub_subcategory x WHERE x.cat_code3 = t.cat_code_3)
    Or as a NOT EXISTS subquery:
    where NOT EXISTS
    (select x.* from stage.stg_sub_subcategory x WHERE t.subcategory_id = x.subcategory_id AND t.cat_code_3 = x.cat_code3)
  4. harikrish New Member

    Thanks for the reply!
    The problem here is that this is not the only query; this is just a sample from our database. We are expecting the same performance as 2000, or even better, but after migrating we are seeing these issues. We are looking for a solution such as a change to a database setting, or a service pack, that will resolve the issue.
    Thanks
    Hari
  5. Saurabh Srivastava New Member

    Use a hint in the query to force hash joins. Hint usage is not generally recommended, but if it improves performance then use it, and change the code later to avert trouble when migrating to future versions. I presume you have already done reindexing, updated statistics with a full scan, etc. after upgrading to 2005.
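    As a sketch, the hash join can be forced query-wide with an OPTION clause; the query text mirrors the one posted above, so verify the names against your schema:

    ```sql
    -- Query-level hint: every join in this plan must be a hash join.
    select t.subcategory_id, t.subsubcategory, t.cat_code_3
    from #tmpsubsubcategory t
    where str(t.subcategory_id) + t.cat_code_3 not in
          (select str(subcategory_id) + cat_code3 from stage.stg_sub_subcategory)
    option (hash join);
    ```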
  6. Saurabh Srivastava New Member

    BTW, which service pack is your SQL 2005 running?
  7. madhuottapalam New Member

    There were a few issues reported regarding performance degradation due to inefficient query plans in 2005 compared to 2000. This was because of architectural changes like statement-level compilation. You may need to rewrite the query, or you may use plan forcing: SQL Server 2005 introduces the USE PLAN query hint, which can force the query optimizer to use a specified query plan for a query. Read about this feature in BOL.
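    The shape of the hint is sketched below; the placeholder XML must be replaced with a real showplan captured for the exact query text (e.g. via SET SHOWPLAN_XML ON or a saved .sqlplan file):

    ```sql
    -- Sketch only: the N'...' literal must contain a complete, valid
    -- XML showplan for this exact statement, or the query fails to compile.
    select t.subcategory_id, t.subsubcategory, t.cat_code_3
    from #tmpsubsubcategory t
    where str(t.subcategory_id) + t.cat_code_3 not in
          (select str(subcategory_id) + cat_code3 from stage.stg_sub_subcategory)
    option (use plan N'<ShowPlanXML ...> ... </ShowPlanXML>');
    ```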
    Madhu
  8. Adriaan New Member

    You can throw any higher-level approach at it, but that still leaves the problem of poor use of T-SQL syntax.
    The problem in this specific query is caused by the use of concatenation, which makes it clear that the original developer(s) knew about subqueries, but not about correlated subqueries.
    [EDIT] Plus the concatenation means that you could get mismatches ...
    "ABC" can be a concatenation of "A" and "BC", or of "AB" and "C", or of "ABC" and an empty string, or of an empty string and "ABC" - you can't tell from "ABC" alone.
    What else did the developer(s) not know about? Time to do a performance audit as outlined in one of the articles on this site - which in fact should be mandatory reading for any developer, since it explains so well how T-SQL syntax, data structure, indexes, and even hardware (should) interact.
  9. Elisabeth Redei New Member

    Hi Hari,
    You didn't tell us how you upgraded but I suggest you run sp_updatestats to update all the statistics in the database. Or run UPDATE STATISTICS on all tables in the database if you want to specify a SAMPLE rate other than default.
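    A minimal sketch of both options, using the staging table named in the plans above as the example:

    ```sql
    -- Database-wide, default sample rate:
    EXEC sp_updatestats;

    -- Per table, controlling the sample rate:
    UPDATE STATISTICS stage.stg_sub_subcategory WITH FULLSCAN;
    -- or, e.g.:
    -- UPDATE STATISTICS stage.stg_sub_subcategory WITH SAMPLE 50 PERCENT;
    ```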
    HTH!
    /Elisabeth
  10. preethi Member

    [quote user="Elisabeth Redei"] suggest you run sp_updatestats to update all the statistics in the database[/quote]
    Also, make sure the indexes are rebuilt.
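    For example, on SQL Server 2005 all indexes on a table can be rebuilt in one statement (the table name here is the staging table from the thread):

    ```sql
    -- Rebuilding also refreshes the index statistics with a full scan.
    ALTER INDEX ALL ON stage.stg_sub_subcategory REBUILD;
    ```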
  11. gurucb New Member

    "There were a few issues reported regarding performance degradation due to inefficient query plans in 2005 compared to 2000. This was because of architectural changes like statement-level compilation."
    I thought statement-level recompilation was a good introduction in SQL Server 2005. That apart, there is one major architectural difference in the costing of nonclustered versus clustered indexes.
    In SQL Server 2000, a nonclustered index was sometimes chosen over the clustered index, which caused random reads due to bookmark lookups. The costing calculation did not take the randomness of that I/O into account, so the cost came out low even though the plan was not appropriate.
    Coming to this problem:
    * First, there are no indexes, because everything is a table scan, and updating statistics will only help if any auto-created statistics exist.
    * Even if there were indexes, they might not be used, because matching data types are not being compared. That is why both SQL Server 2000 and SQL Server 2005 are doing converts. SQL Server 2005 helps by spelling out CONVERT_IMPLICIT, i.e. SQL Server is forced to convert due to a mismatch of data types.
    * Create/update appropriate statistics and indexes in the appropriate databases.
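    As a sketch, an index that a concatenation-free, type-matched predicate could actually seek on might look like this; the index name is illustrative, and the real column data types should be checked so no CONVERT_IMPLICIT is needed at query time:

    ```sql
    -- Covers the join/anti-join key used by the rewritten query.
    CREATE INDEX ix_stg_sub_subcategory_key
        ON stage.stg_sub_subcategory (subcategory_id, cat_code3);
    ```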
  12. Adriaan New Member

    Call me stubborn, but the original query is still so wrong that rebuilding indexes or updating statistics will not make as much of a difference as rewriting the query.
    OP Hari mentioned that this was not the only query, but taken as an indication of the quality of the work done by the developer(s), it means they should really review the whole system. The concatenation technique is dangerous in that it allows false matches, so the query is not dependable. How many more queries are there like it? How many before the system is no longer considered trustworthy?
