SQL Server Performance

What is the difference?!!!

Discussion in 'T-SQL Performance Tuning for Developers' started by plextoR, Dec 19, 2005.

  1. plextoR New Member

    What is the difference between these 3 statements:<br /><br />1- SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; 32<br /> This takes about 2 seconds.<br /><br />2- SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; (SELECT schYearID FROM scholasticyear WHERE currentYear = 1)<br /> This takes about 45 seconds!!<br /><br />3- DECLARE @SchYearID INT<br />SET @SchYearID = (SELECT SchYearID FROM ScholasticYear WHERE CurrentYear = 1)<br />SELECT TOP 1 StudentIDFROM UniqStdSchBasicData WHERE SchYearID &gt; @SchYearID<br /> This takes more than 2 Mins!!!!<br /><br />&lt;font color="blue"&gt<img src='/community/emoticons/emotion-4.gif' alt=';p' />lextoR&lt;/font id="blue"&gt;
  2. ranjitjain New Member

    Do you have any index on SchYearID column.
    Also update statistics and how many rows your table has.
    Something is wrong as these query should not take so much time.
  3. plextoR New Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by ranjitjain</i><br /><br />Do you have any index on SchYearID column.<br />Also update statistics and how many rows your table has.<br />Something is wrong as these query should not take so much time.<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br />Yes the SchYearID column is indexed.<br />and the table schYeatID now has only 32 rows!<br /><br />&lt;font color="blue"&gt<img src='/community/emoticons/emotion-4.gif' alt=';p' />lextoR&lt;/font id="blue"&gt;
  4. FrankKalis Moderator

  5. ranjitjain New Member

    Can you check execution plan for this query.
    and how many rows and index details for this table.

    SELECT SchYearID FROM ScholasticYear WHERE CurrentYear = 1
  6. Madhivanan Moderator

    1 Querying only one table
    2 Querying two tables
    3 Dynamic SQL

    Third should be

    DECLARE @SchYearID INT
    Select @SchYearID = SchYearID FROM ScholasticYear WHERE CurrentYear = 1
    SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID > @SchYearID

    to avoid error if the resultant is more than one value

    How many rows do each table have?

    Madhivanan

    Failing to plan is Planning to fail
  7. Madhivanan Moderator

    Also Top should be used only if you use Order by Clause otherwise it will randomly return a value

    Madhivanan

    Failing to plan is Planning to fail
  8. FrankKalis Moderator

    Madhivanan, what is dynamic about


    DECLARE @SchYearID INT
    SET @SchYearID = (SELECT SchYearID FROM ScholasticYear WHERE CurrentYear = 1)
    SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID > @SchYearID

    ?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    Heute schon gebloggt?http://www.insidesql.de/blogs
  9. SQLDBcontrol New Member

    Essentially the main difference is that in the 2nd and 3rd examples you are selecting a value for schYearId from the scholasticyear table where the currentYear is 1. Whereas in the first example you have hardcoded a value for schYearId.<br /><br />The third example is slower than the second example because you are running it as two seperate statements, whereas in the second example you run it as part of the one query.<br /><br />With regards to overall performance, this has nothing to do with whether you have an index or not on the schYearId column in UniqStdSchBasicData. You've already proved that the first query runs fine. The problem lies with the search for a schYearId based on the currentYear column in the scholasticyear table. So that's where I'd look at a candidate for an index (possibly a clustered, depending on the uniqueness of that column?).<br /><br />Also, try this for the first part of the third statement (not sure how much difference that would make:<br /><br />DECLARE @SchYearID INT<br />SELECT @SchYearID = SchYearID FROM ScholasticYear WHERE CurrentYear = 1<br /><br />As has been mentioned, you need to use the ORDER BY clause otherwise the results will be ordered based on the clustered index in the table by default.<br /><br />Hope that helps,<br /><br />Karl<br /><br /><br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by plextoR</i><br /><br />What is the difference between these 3 statements:<br /><br />1- SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; 32<br /> This takes about 2 seconds.<br /><br />2- SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; (SELECT schYearID FROM scholasticyear WHERE currentYear = 1)<br /> This takes about 45 seconds!!<br /><br />3- DECLARE @SchYearID INT<br />SET @SchYearID = (SELECT SchYearID FROM ScholasticYear WHERE CurrentYear = 1)<br />SELECT TOP 1 StudentIDFROM UniqStdSchBasicData WHERE SchYearID &gt; @SchYearID<br /> This takes more than 2 Mins!!!!<br /><br />&lt;font color="blue"&gt<img src='/community/emoticons/emotion-4.gif' alt=';p' />lextoR&lt;/font id="blue"&gt;<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br />Karl Grambow<br /><br />www.sqldbcontrol.com
  10. Madhivanan Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by FrankKalis</i><br /><br />Madhivanan, what is dynamic about<br /><pre id="code"><font face="courier" size="2" id="code"><br />DECLARE @SchYearID INT<br />SET @SchYearID = (SELECT SchYearID FROM ScholasticYear WHERE CurrentYear = 1)<br />SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; @SchYearID<br /></font id="code"></pre id="code"><br />?<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br /><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Well. At first look I thought Exec is used. Thanks for pointing that [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />Madhivanan<br /><br />Failing to plan is Planning to fail
  11. Hartmut5 New Member

    plextoR,

    Could you give us the datatype of the schYearID column in both the UniqStdSchBasicData table and the scholasticyear table?

    There could be a datatype precedence problem here.


    -Hartmut5
  12. plextoR New Member

    The scholasticYear table has only 32 tables.

    quote:Could you give us the datatype of the schYearID column
    Id is INT


    quote:The problem lies with the search for a schYearId based on the currentYear column in the scholasticyear table
    The folowing statements takes 0 seconds!!!
    SELECT schYearID FROM scholasticyear WHERE currentYear = 1

    DECLARE @SchYearID INT
    SELECT @SchYearID = SchYearID FROM ScholasticYear WHERE CurrentYear = 1


    FYI,
    - The UniqStdSchBasicData is a view that returns count of 1422408 rows in 12 seconds.

    - The id field of the StdSchBasicData table (not the uniq.. View) has 1 culstered index and one non-clusterd.
    actually i don't know what is the idea behind that! but i also don't think that this is the problem.

    in the execution plan i found that:

    For statement 1:
    Clustered Index Scan -> Cost 86% Arguments -> OBJECT(StdSchBasicData.[PK_StdSchBasicData]),WHERE: (StdSchBasicData.SchYearID=32)
    ------------

    For statement 2:
    Hash Match/Aggregate -> Cost 40% Arguments -> HASH: (StdSchBasicData.StudentID,StdSchBasicData.SchYearID), RESIDUAL(StdSchBasicData.StudentID = StdSchBasicData.StudentID AND StdSchBasicData.SchYearID=StdSchBasicData.SchYearID),[expr1004]=MAX(StdSchBasicData.StdSchBasicDat

    Clustered Index Scan -> Cost 58% Arguments -> StdSchBasicData.[PK_StdSchBasicData]
    ------------

    For statement 3:
    The first part (setting the @SchYearID param) 0.07%
    The Second Part:
    Hash Match/Aggregate -> Cost 39% Arguments -> as statement no 2
    Clustered Index Scan -> Cost 58% Arguments -> as statement no 2
  13. SQLDBcontrol New Member

    This is statement 3:

    3- DECLARE @SchYearID INT
    SELECT @SchYearID = SchYearID FROM ScholasticYear WHERE CurrentYear = 1
    ---
    SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID > @SchYearID

    The first part takes 0 seconds as you pointed out. So it is the second part that must be taking 2 minutes.

    But, the second part is identical to the first statement, "SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID > 32" except for the fact that you have hardcoded the value of 32 in the first statement.

    Now, you said that the first statement only takes 2 seconds.

    So what we've got to find out is why the last part of the third statement takes 2 minutes when you pass a parameter value to it as opposed to 2 seconds when you're using a hardcoded value.

    It could be that different values (other than 32) cause it to run slower or bad execution plan. The hash/match aggregate doesn't make sense either.

    Anyway, I'd start by looking at the first statement and passing different values to it and also experimenting with passing parameter values to it.



    quote:Originally posted by plextoR

    The scholasticYear table has only 32 tables.

    quote:Could you give us the datatype of the schYearID column
    Id is INT


    quote:The problem lies with the search for a schYearId based on the currentYear column in the scholasticyear table
    The folowing statements takes 0 seconds!!!
    SELECT schYearID FROM scholasticyear WHERE currentYear = 1

    DECLARE @SchYearID INT
    SELECT @SchYearID = SchYearID FROM ScholasticYear WHERE CurrentYear = 1


    FYI,
    - The UniqStdSchBasicData is a view that returns count of 1422408 rows in 12 seconds.

    - The id field of the StdSchBasicData table (not the uniq.. View) has 1 culstered index and one non-clusterd.
    actually i don't know what is the idea behind that! but i also don't think that this is the problem.

    in the execution plan i found that:

    For statement 1:
    Clustered Index Scan -> Cost 86% Arguments -> OBJECT(StdSchBasicData.[PK_StdSchBasicData]),WHERE: (StdSchBasicData.SchYearID=32)
    ------------

    For statement 2:
    Hash Match/Aggregate -> Cost 40% Arguments -> HASH: (StdSchBasicData.StudentID,StdSchBasicData.SchYearID), RESIDUAL(StdSchBasicData.StudentID = StdSchBasicData.StudentID AND StdSchBasicData.SchYearID=StdSchBasicData.SchYearID),[expr1004]=MAX(StdSchBasicData.StdSchBasicDat

    Clustered Index Scan -> Cost 58% Arguments -> StdSchBasicData.[PK_StdSchBasicData]
    ------------

    For statement 3:
    The first part (setting the @SchYearID param) 0.07%
    The Second Part:
    Hash Match/Aggregate -> Cost 39% Arguments -> as statement no 2
    Clustered Index Scan -> Cost 58% Arguments -> as statement no 2


    Karl Grambow

    www.sqldbcontrol.com
  14. Twan New Member

    Hi ya,

    The problem is that in statements 2 and 3 SQL doesn't know what the value of SchYearID is going to be at the time that the statements are compiled as a single batch.

    To get around that you could split the batch into two, either by first getting the value of SchYearID and passing it back to the application, or if this is all in a proc then saving it as a local variable

    and then call the final statement either from the application or by calling a sub-procedure and passing SchYearID as a parameter

    also the index on StdSchBasicData.SchYearID appears to be clustered index (or part of it), which is not going to be that good for performance. A non-clustered index on that column would help

    Cheers
    Twan
  15. mmarovic Active Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by plextoR</i><br /><br />What is the difference between these 3 statements:<br /><br />1- SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; 32<br /> This takes about 2 seconds.<br /><br />2- SELECT TOP 1 StudentID FROM UniqStdSchBasicData WHERE SchYearID &gt; (SELECT schYearID FROM scholasticyear WHERE currentYear = 1)<br /> This takes about 45 seconds!!<br /><br />3- DECLARE @SchYearID INT<br />SET @SchYearID = (SELECT SchYearID FROM ScholasticYear WHERE CurrentYear = 1)<br />SELECT TOP 1 StudentIDFROM UniqStdSchBasicData WHERE SchYearID &gt; @SchYearID<br /> This takes more than 2 Mins!!!!<br /><br />&lt;font color="blue"&gt<img src='/community/emoticons/emotion-4.gif' alt=';p' />lextoR&lt;/font id="blue"&gt;<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">Twan is right about why queries behaves the way you described and possible workarounds. Another solution you can also try is:<br /><br /><pre id="code"><font face="courier" size="2" id="code">select top 1 StudentID<br />from scholasticyear y<br />inner loop join UniqStdSchBasicData d (index=&lt;index on schYearID&gt<img src='/community/emoticons/emotion-5.gif' alt=';)' /> on d.SchYearID &gt; y.SchYearID<br />where y.currentYear = 1<br />option (force order)</font id="code"></pre id="code">
  16. Hartmut5 New Member

    In all 3 cases, the database is doing a clustered index scan on the StdSchBasicData table. Usually this means that it is reading the entire table, but in this case, with "top 1", it is most likely scanning records just until it finds one that matches the criteria. I believe it was stated that the table has about 1.4 million rows. This will take a long time to read, depending on how soon it finds a match. If no rows meet the criteria, then it will end up reading all 1.4 million rows before returning a result. If it finds a match in the first couple rows, it will return a result in less than a second.

    The execution time will also depend a lot on whether or not the data pages are already in the buffer pool or not. To truly test the performance, you will probably want to run DBCC DROPCLEANBUFFERS before each query to ensure that no data pages are stored in cache before the query is run.

    It would be helpful to see the entire execution plan, as it looks like it is doing a hash-type self join on the 1.4 million row table for the second and third queries. You can get the full execution plan by using SET SHOWPLAN_TEXT ON before running the query (this will not actually execute the query, it will just give you the plan).

    In any case, you generally want to avoid clustered index scans, especially on large or wide tables. Non-clustered index scans are better than clustered, but index seeks are what you really want when dealing with a large amount of data.

    Double check to make sure that there is an index on the SchYearID column in the StdSchBasicData table.


    -Hartmut5

Share This Page