SQL Server Performance

Simple SELECT for view is taking long time

Discussion in 'T-SQL Performance Tuning for Developers' started by Rusty007, Dec 14, 2008.

  1. Rusty007 New Member

    Hi everybody,
    I have a DTS package that runs every morning to import data to SQL 2000 tables from Sybase. I'm having problem with one of the package from this friday. The scenario is as follows:
    The source for the DTS package is a view. The view does the select from the only one table of around 34 million rows and also has a join with the SELECT from same table using AS. It usually took around 2 hours. However, suddenly the job neither succeeded nor failed on this friday. I explicitly executed DTS package and found out that it get stuck on same data flow step which SELECT view without throwing any error.
    I would appreciate if anybody can throw some light on this issue.
    Thanks,
    Rusty
  2. satya Moderator

    This could be a blocking/locking issue on SYBASE whereby the resources are waiting to clearup in order to import rows?
    Check on SYBASE side and SQL Server side too for such indications.
  3. Adriaan New Member

    To add to what satya mentioned, since this is happening on a self-join, perhaps the table does not have a primary key, or no index on the join column(s), or no clustered index.
  4. Rusty007 New Member

    Thanks Satya and Adriaan,
    I really appreciate you guys precious time to help me out. Let me make myself more clear. Actually, I was just explaining about the process. However, the problem is in SQL 2000 since this occurs after everything is loaded on staging table. I checked for the blocking using SP_WHo2 'active' and didn't see any blocking. Adriaan is right the view does SELECT from TableA which is self joined. It has around 32 million rows and Its a heap table. There are some statistics though. I also updated statistisc with FULLScan. It succeeded but I'm still having problem with the SELECT.
    Thanks in Advance for your help!
    The script for the view is as follows:
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_NULLS ON
    GO
    ALTER VIEW dbo.X AS
    SELECT TOP 100 PERCENT *
    FROM (SELECT A.DIMENSION_NAME,A.DIMENSION_VALUE, CONVERT(datetime, CONVERT(char(2), DATEPART(mm, A.CLM_PD_DT)) + '/01/' + CONVERT(char(4), DATEPART(yyyy, A.CLM_PD_DT)))
    AS CLM_PD_DT, CONVERT(datetime, CONVERT(char(2), DATEPART(mm, B.CLM_PD_DT)) + '/01/' + CONVERT(char(4), DATEPART(yyyy,
    B.CLM_PD_DT))) AS PREV_CLM_PD_DT, A.RUN_DATE, B.RUN_DATE AS PREV_RUN_DATE, A.SUM_PAID, A.REC_COUNT, A.RUN_COUNT,
    B.SUM_PAID AS PREV_SUM_PAID, B.REC_COUNT AS PREV_REC_COUNT, A.SUM_PAID - B.SUM_PAID AS PAID_CHANGE,
    A.REC_COUNT - B.REC_COUNT AS REC_CHANGE
    FROM TableA A LEFT OUTER JOIN
    TableA B ON A.CLM_PD_DT = B.CLM_PD_DT AND A.RUN_COUNT = B.RUN_COUNT + 1 AND A.DIMENSION_NAME = B.DIMENSION_NAME AND A.DIMENSION_VALUE = B.DIMENSION_VALUE) tmp
    WHERE (RUN_COUNT <> '1') AND (CLM_PD_DT NOT IN
    (SELECT DISTINCT TOP 5 CONVERT(datetime, CONVERT(char(2), DATEPART(mm, CLM_PD_DT)) + '/01/' + CONVERT(char(4), DATEPART(yyyy, CLM_PD_DT)))
    FROM TableA ORDER BY CONVERT(datetime, CONVERT(char(2), DATEPART(mm, CLM_PD_DT)) + '/01/' + CONVERT(char(4), DATEPART(yyyy, CLM_PD_DT))) DESC ))
    ORDER BY CLM_PD_DT, RUN_DATE DESC
    GO
    SET QUOTED_IDENTIFIER OFF
    GO
    SET ANSI_NULLS ON
    GO


  5. Adriaan New Member

    Add an identity column to your heap table, and make that the primary key. With 32M rows, I'd also add some indexes on columns used for filtering.
    You'll be amazed.
  6. Adriaan New Member

    And by the way, with this much processing going on in the view, do not expect it to perform well.
    If you need to filter out results from this view, you would be better off re-writing this as a stored procedure, adding the filter criteria as part of the query statement.
  7. Rusty007 New Member

    Thanks Adriann for your precious time! Last time I had this issue I did update statistics and it worked. I'm not sure we have any unique column. So, I'm planning to create non-clustered index on one of the column. I'll update on this post.
    In the meantime, if anybody has been through this and any brilliant ideas, please keep me posted!~
    Thanks,
    Rusty
  8. Adriaan New Member

    The whole idea of adding an identity column is so that your heap table will have a unique column, that you can use as a PK.
    The values on the identity column will be meaningless, but a PK is just one of the things that SQL needs in order to perform well.
  9. satya Moderator

    When you are an index just keep in mind that if that is a non-clustered then make sure to include frequently used columns within the WHERE & JOIN clauses of your query, for clustered it is your own decision to make it the order you want.

Share This Page