SQL Server Performance

Query optimization issue : limit affected rows

Discussion in 'SQL Server 2005 Performance Tuning for DBAs' started by johnny_crash, Jul 10, 2007.

  1. johnny_crash New Member

    Hello all

    I have a query optimization problem, if anyone can help me.
    I have an application that periodically scans a database, but it only needs to look at the last 1000 rows, and not all database.

    I mention that my dabase is huge - this specific query i am trying to optimize joins 2 tables, with 4 million rows each

    So, my question is how can i apply my query to only the last rows from database, in the best way? I mention that i have tryed with

    select ... from (select top 100 * from X order by ... DESC) .... where ...

    By looking at statistics, i haven't noticed a performance increase, but contrary - a performance DECREASE.

    Any help would be apreciated
    Best regards,
    Johnny
  2. johnny_crash New Member

    forgot to tell :
    i am using microsoft sql 2005
  3. satya Moderator

    I feel thathttp://sqlserver-qa.net/blogs/perft...erver-2005-high-cpu-occurrence-why-it-is.aspx &http://sqlserver-qa.net/blogs/perft...shoot-slow-running-queries-in-sql-server.aspx are starting ones to see where it is lacking the performance.

    After all, if SQL Server isn't using an efficient execution plan, there must be something wrong with your table design or indexing strategy. If you designed your system well, and it still doesn't perform well, you must be lacking appropriate hardware resources. Plan guides are most commonly used to specify RECOMPILE or OPTIMIZE FOR query hints. However, you can also advise SQL Server to use a particular type of JOIN, use a predefined execution plan. In this regardhttp://sqlserver-qa.net/blogs/perft...op-stored-procedures-that-are-recompiled.aspx &http://sqlserver-qa.net/blogs/perft...-optimizer-reuses-for-better-performance.aspx fyi.

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
  4. johnny_crash New Member

    Grretings<br /><br />Thank you for your quick answer. Yes, you are right, there is something very wrong with my table structure <img src='/community/emoticons/emotion-1.gif' alt=':)' /> but unfortunately i can't modify it ( we are only adding a module in an application for a client, that heavily relies on the current table structure ). I just have to take the best out of the current configuration.<br /><br />I will look at links you have posted, but still - wouldn't be a good ideea to tell SQL server somehow to only look at the last 1000 rows when executing the query ? <b>Does such a command exists ?</b><br /><br />Best regards,<br />Johnny
  5. satya Moderator

    SQL Server typically returns rows in whatever order is most efficient, and data values have no effect on which rows are returned when you use either TOP or SET ROWCOUNT.

    TOP is reproducibly faster than the SET ROWCOUNT statement, and it has an advantage availing from internal feature of the SQL Server if the table has no supporting indexes.

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
  6. FrankKalis Moderator

    The "last 1000 rows" is only meaningful when you do something like a

    SELECT TOP 1000...
    ORDER BY ... DESC

    or

    SET ROWCOUNT 1000
    SELECT ...
    ORDER BY ... DESC

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
    Webmaster:http://www.insidesql.de
  7. FrankKalis Moderator

    quote:
    TOP is reproducibly faster than the SET ROWCOUNT statement, and it has an advantage availing from internal feature of the SQL Server if the table has no supporting indexes.
    Is it?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
    Webmaster:http://www.insidesql.de
  8. satya Moderator

    Yes, I have tested it before posting that Tip [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]...<br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center>
  9. satya Moderator

    Ok revising the question you have asked you could take help of TABLESAMPLE by "Limiting Result Sets by Using this clause":
    http://technet.microsoft.com/en-us/library/ms189108.aspx &http://blogs.netindonesia.net/kiki/articles/9103.aspx to go about it.

    quote:Originally posted by johnny_crash

    ......... but still - wouldn't be a good ideea to tell SQL server somehow to only look at the last 1000 rows when executing the query ? Does such a command exists ?

    Best regards,
    Johnny

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
  10. FrankKalis Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by satya</i><br /><br />Yes, I have tested it before posting that Tip [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]...<br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Can you post a repro script?<br />From all the testings I've done and all the threads I've followed on this topic, I always thought that there is hardly any significant difference at all.<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  11. FrankKalis Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by satya</i><br /><br />Ok revising the question you have asked you could take help of TABLESAMPLE by "Limiting Result Sets by Using this clause":<br /<a target="_blank" href=http://technet.microsoft.com/en-us/library/ms189108.aspx>http://technet.microsoft.com/en-us/library/ms189108.aspx</a> &<a target="_blank" href=http://blogs.netindonesia.net/kiki/articles/9103.aspx>http://blogs.netindonesia.net/kiki/articles/9103.aspx</a> to go about it.<br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by johnny_crash</i><br /><br />......... but still - wouldn't be a good ideea to tell SQL server somehow to only look at the last 1000 rows when executing the query ? <b>Does such a command exists ?</b><br /><br />Best regards,<br />Johnny<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />You don't have control over the order here. So, no guarantee that the "last rows" are returned. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  12. satya Moderator

    Check your email fyi.

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
  13. FrankKalis Moderator

  14. satya Moderator

    BOL <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />If a SELECT statement that includes TOP also has an ORDER BY clause, the rows to be returned are selected from the ordered result set. The whole result set is built in the specified order and the top n rows in the ordered result set are returned. If WITH TIES is also specified, all rows that contain the last value returned by the ORDER BY clause are returned, even if doing this exceeds the number specified by expression.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"> So <br />by using a combination of TABLESAMPLE and TOP, the data that you obtain will near-requirement results.<br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">.............<br />You don't have control over the order here. So, no guarantee that the "last rows" are returned. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center>
  15. FrankKalis Moderator

    Satya, have you tried so? [<img src='/community/emoticons/emotion-5.gif' alt=';)' />]<br />The order of rows is of importance to the OP.<br /><br />Run this:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT TOP 100 *<br /> FROM Northwind.db:confused:rders<br /> ORDER BY OrderID<br />-- <br />SET ROWCOUNT 100<br /> SELECT *<br /> FROM Northwind.db:confused:rders<br /> ORDER BY OrderID<br /></font id="code"></pre id="code"><br />Both statements will return identical resultsets. The "first" 100 rows. Just as you would expect because of the ORDER BY clause. Now run this:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT TOP 100 *<br /> FROM Northwind.db:confused:rders<br /> TABLESAMPLE (100 ROWS)<br /> ORDER BY OrderID<br /></font id="code"></pre id="code"><br />The resultset is completely different from the first two ones. It appears that the TABLESAMPLE clause is applied first to the Orders table. The intermediate resultset of this operation than is sorted according to the ORDER BY clause and the "first" 100 rows are returned as final resultset. If you run this subsequently several times you'll observe that the resultset mostly differs, if you get a resultset at all, which also isn't guaranteed. So, this combination of TOP and TABLESAMPLE yields an unpredictable resultset, which isn't appropriate in this case here. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  16. FrankKalis Moderator

    Btw, just realized that when you change the last SELECT to:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT TOP 100 *<br /> FROM Northwind.db:confused:rders<br /> TABLESAMPLE (10 ROWS)<br /> ORDER BY OrderID<br /></font id="code"></pre id="code"><br />it gets even more funny. If you get a resultset at all it appears to be unpredictable in the number of rows returned. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  17. johnny_crash New Member

    Greetings<br /><br />I noticed some traffic around here <img src='/community/emoticons/emotion-1.gif' alt=':)' /><br />Thank you for your time spent with replyes,<br />but i can't use TABLESAMPLE, as it requires a compatibility level of 90% <br />I cannot change any of database's settings.<br /><br />BTW<br />I tryed to use NEWID function, as in the following example :<br /><br /> SELECT * FROM Sales.SalesOrderDetail<br /><br />WHERE 0.01 &gt;= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float) <br /><br />and looked at statistics. Even if the overall row lookup decreased 3 times, the spent time had actually increased becouse of string manipulation caused by NEWID function - wich is much slower.<br /><br />Any other ideas? <img src='/community/emoticons/emotion-2.gif' alt=':D' />
  18. FrankKalis Moderator

  19. johnny_crash New Member

    Sorry

    I need "last" rows. I haven't noticed the article was talking about "random" rows, especially that in my particular test case, has selected the "last" rows
  20. FrankKalis Moderator

    Thx.
    So, your query looks like

    select ... from (select top 100 * from X order by ... DESC) .... where ...

    It this the complete statement or why are you using a derived table here?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
    Webmaster:http://www.insidesql.de
  21. johnny_crash New Member

    Frank,

    That was only an attempt to optimize it.
    My query is rather simple in its structure (only two table joins, with few conditions)
    If helps you in any way... my query is :



    select
    y.id,
    y.carrier,
    y.flight_number,
    y.flight_date,
    y.departure_time,
    y.departure_date,
    y.aircraft_type,
    sum(CASE WHEN x.action_status='NOT' THEN x.fbl_pieces ELSE x.exp_pieces END),
    sum(CASE WHEN x.action_status='NOT' THEN x.fbl_weight ELSE x.exp_weight END),
    sum(CASE WHEN x.action_status='ACC' THEN x.exp_pieces ELSE 0 END),
    sum(CASE WHEN x.action_status='ACC' THEN x.exp_weight ELSE 0 END),
    y.departed
    from
    XXX x with (nolock),
    YYY y with (nolock)
    where
    (y.departure_date='2007-7-6 00:00:00.0' and y.departure_time>='1000' and y.departure_time<='2000' ) and
    y.origin='AMS' and
    x.origin='AMS' and
    x.carrier = y.carrier and
    x.flight_number = y.flight_number and
    x.flight_date = y.flight_date
    group by
    y.seq,
    y.carrier,
    y.flight_number,
    y.flight_date,
    y.departure_time,
    y.departure_date,
    y.aircraft_type,
    y.departed
    order by y.seq DESC


    Two tables are used : YYY and XXX, each of them with a huge number of rows
    so i need to write this as optimized as possible.
    I have removed the SUM and GROUP BY statements, but they don't make much difference.

    Thank you for your help,
    Johnny
  22. satya Moderator

    As you are using SQL 2005 there shouldn;t be a problem of setting up it to 90 compatibility mode as it will be.

    Are you querying against a SQL 2000 server?

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
  23. Madhivanan Moderator

    quote:Originally posted by FrankKalis


    quote:
    TOP is reproducibly faster than the SET ROWCOUNT statement, and it has an advantage availing from internal feature of the SQL Server if the table has no supporting indexes.
    Is it?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
    Webmaster:http://www.insidesql.de
    FYI
    http://www.sql-server-performance.com/forum/topic.asp?ARCHIVE=true&TOPIC_ID=1140

    Madhivanan

    Failing to plan is Planning to fail
  24. FrankKalis Moderator

    Thanks Madhi, I knew that. Though it is from back in 2003, I'm not sure this exactly holds anymore. And I also believe that there are only a few cases in which in heap is appropriate.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
    Webmaster:http://www.insidesql.de
  25. Madhivanan Moderator

  26. FrankKalis Moderator

  27. FrankKalis Moderator

    quote:Originally posted by johnny_crash

    Frank,

    That was only an attempt to optimize it.
    My query is rather simple in its structure (only two table joins, with few conditions)
    If helps you in any way... my query is :



    select
    leg.seq,
    leg.carrier,
    leg.flight_number,
    leg.flight_date,
    leg.departure_time,
    leg.departure_date,
    leg.aircraft_type,
    sum(CASE WHEN am.action_status='NOT' THEN am.fbl_pieces ELSE am.exp_pieces END),
    sum(CASE WHEN am.action_status='NOT' THEN am.fbl_weight ELSE am.exp_weight END),
    sum(CASE WHEN am.action_status='ACC' THEN am.exp_pieces ELSE 0 END),
    sum(CASE WHEN am.action_status='ACC' THEN am.exp_weight ELSE 0 END),
    leg.departed
    from
    awb_move am with (nolock),
    flight_leg leg with (nolock)
    where
    (leg.departure_date='2007-7-6 00:00:00.0' and leg.departure_time>='1000' and leg.departure_time<='2000' ) and
    leg.origin='AMS' and
    am.origin='AMS' and
    am.carrier = leg.carrier and
    am.flight_number = leg.flight_number and
    am.flight_date = leg.flight_date
    group by
    leg.seq,
    leg.carrier,
    leg.flight_number,
    leg.flight_date,
    leg.departure_time,
    leg.departure_date,
    leg.aircraft_type,
    leg.departed
    order by leg.seq DESC


    Two tables are used : flight_leg and awb_move, each of them with a huge number of rows
    so i need to write this as optimized as possible.
    I have removed the SUM and GROUP BY statements, but they don't make much difference.

    Thank you for your help,
    Johnny

    How many rows is this query supposed to return? And what does the execution plan look like?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
    Webmaster:http://www.insidesql.de
  28. sunilmadan New Member

Share This Page