Query optimization issue : limit affected rows | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Query optimization issue : limit affected rows

Hello all I have a query optimization problem, if anyone can help me.
I have an application that periodically scans a database, but it only needs to look at the last 1000 rows, and not all database. I mention that my dabase is huge – this specific query i am trying to optimize joins 2 tables, with 4 million rows each So, my question is how can i apply my query to only the last rows from database, in the best way? I mention that i have tryed with select … from (select top 100 * from X order by … DESC) …. where … By looking at statistics, i haven’t noticed a performance increase, but contrary – a performance DECREASE. Any help would be apreciated
Best regards,
Johnny

forgot to tell :
i am using microsoft sql 2005
I feel thathttp://sqlserver-qa.net/blogs/perft…erver-2005-high-cpu-occurrence-why-it-is.aspx &http://sqlserver-qa.net/blogs/perft…shoot-slow-running-queries-in-sql-server.aspx are starting ones to see where it is lacking the performance. After all, if SQL Server isn’t using an efficient execution plan, there must be something wrong with your table design or indexing strategy. If you designed your system well, and it still doesn’t perform well, you must be lacking appropriate hardware resources. Plan guides are most commonly used to specify RECOMPILE or OPTIMIZE FOR query hints. However, you can also advise SQL Server to use a particular type of JOIN, use a predefined execution plan. In this regardhttp://sqlserver-qa.net/blogs/perft…op-stored-procedures-that-are-recompiled.aspx &http://sqlserver-qa.net/blogs/perft…-optimizer-reuses-for-better-performance.aspx fyi. Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
Grretings<br /><br />Thank you for your quick answer. Yes, you are right, there is something very wrong with my table structure <img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /> but unfortunately i can’t modify it ( we are only adding a module in an application for a client, that heavily relies on the current table structure ). I just have to take the best out of the current configuration.<br /><br />I will look at links you have posted, but still – wouldn’t be a good ideea to tell SQL server somehow to only look at the last 1000 rows when executing the query ? <b>Does such a command exists ?</b><br /><br />Best regards,<br />Johnny
SQL Server typically returns rows in whatever order is most efficient, and data values have no effect on which rows are returned when you use either TOP or SET ROWCOUNT. TOP is reproducibly faster than the SET ROWCOUNT statement, and it has an advantage availing from internal feature of the SQL Server if the table has no supporting indexes. Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
The "last 1000 rows" is only meaningful when you do something like a SELECT TOP 1000…
ORDER BY … DESC or SET ROWCOUNT 1000
SELECT …
ORDER BY … DESC —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
quote:
TOP is reproducibly faster than the SET ROWCOUNT statement, and it has an advantage availing from internal feature of the SQL Server if the table has no supporting indexes.
Is it? —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
Yes, I have tested it before posting that Tip [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]…<br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center>
Ok revising the question you have asked you could take help of TABLESAMPLE by "Limiting Result Sets by Using this clause":
http://technet.microsoft.com/en-us/library/ms189108.aspx &http://blogs.netindonesia.net/kiki/articles/9103.aspx to go about it.
quote:Originally posted by johnny_crash ……… but still – wouldn’t be a good ideea to tell SQL server somehow to only look at the last 1000 rows when executing the query ? Does such a command exists ? Best regards,
Johnny

Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by satya</i><br /><br />Yes, I have tested it before posting that Tip [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]…<br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Can you post a repro script?<br />From all the testings I’ve done and all the threads I’ve followed on this topic, I always thought that there is hardly any significant difference at all.<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by satya</i><br /><br />Ok revising the question you have asked you could take help of TABLESAMPLE by "Limiting Result Sets by Using this clause":<br /<a target="_blank" href=http://technet.microsoft.com/en-us/library/ms189108.aspx>http://technet.microsoft.com/en-us/library/ms189108.aspx</a> &<a target="_blank" href=http://blogs.netindonesia.net/kiki/articles/9103.aspx>http://blogs.netindonesia.net/kiki/articles/9103.aspx</a> to go about it.<br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by johnny_crash</i><br /><br />……… but still – wouldn’t be a good ideea to tell SQL server somehow to only look at the last 1000 rows when executing the query ? <b>Does such a command exists ?</b><br /><br />Best regards,<br />Johnny<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />You don’t have control over the order here. So, no guarantee that the "last rows" are returned. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
Check your email fyi. Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
Nothing arrived yet. —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
BOL <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />If a SELECT statement that includes TOP also has an ORDER BY clause, the rows to be returned are selected from the ordered result set. The whole result set is built in the specified order and the top n rows in the ordered result set are returned. If WITH TIES is also specified, all rows that contain the last value returned by the ORDER BY clause are returned, even if doing this exceeds the number specified by expression.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"> So <br />by using a combination of TABLESAMPLE and TOP, the data that you obtain will near-requirement results.<br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">………….<br />You don’t have control over the order here. So, no guarantee that the "last rows" are returned. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br /><b>Satya SKJ</b><br />Microsoft SQL Server MVP<br />Writer, Contributing Editor & Moderator<br /<a target="_blank" href=http://www.SQL-Server-Performance.Com>http://www.SQL-Server-Performance.Com</a><br /><center><font color="teal"><font size="1">This posting is provided AS IS with no rights for the sake of <i>knowledge sharing. <hr noshade size="1">Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.</i></font id="size1"></font id="teal"></center>
Satya, have you tried so? [<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ />]<br />The order of rows is of importance to the OP.<br /><br />Run this:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT TOP 100 *<br /> FROM Northwind.db:confused:rders<br /> ORDER BY OrderID<br />– <br />SET ROWCOUNT 100<br /> SELECT *<br /> FROM Northwind.db:confused:rders<br /> ORDER BY OrderID<br /></font id="code"></pre id="code"><br />Both statements will return identical resultsets. The "first" 100 rows. Just as you would expect because of the ORDER BY clause. Now run this:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT TOP 100 *<br /> FROM Northwind.db:confused:rders<br /> TABLESAMPLE (100 ROWS)<br /> ORDER BY OrderID<br /></font id="code"></pre id="code"><br />The resultset is completely different from the first two ones. It appears that the TABLESAMPLE clause is applied first to the Orders table. The intermediate resultset of this operation than is sorted according to the ORDER BY clause and the "first" 100 rows are returned as final resultset. If you run this subsequently several times you’ll observe that the resultset mostly differs, if you get a resultset at all, which also isn’t guaranteed. So, this combination of TOP and TABLESAMPLE yields an unpredictable resultset, which isn’t appropriate in this case here. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
Btw, just realized that when you change the last SELECT to:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT TOP 100 *<br /> FROM Northwind.db:confused:rders<br /> TABLESAMPLE (10 ROWS)<br /> ORDER BY OrderID<br /></font id="code"></pre id="code"><br />it gets even more funny. If you get a resultset at all it appears to be unpredictable in the number of rows returned. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Contributing Editor, Writer & Forum Moderator<a target="_blank" href=http://www.sql-server-performance.com>http://www.sql-server-performance.com</a><br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
Greetings<br /><br />I noticed some traffic around here <img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /><br />Thank you for your time spent with replyes,<br />but i can’t use TABLESAMPLE, as it requires a compatibility level of 90% <br />I cannot change any of database’s settings.<br /><br />BTW<br />I tryed to use NEWID function, as in the following example :<br /><br /> SELECT * FROM Sales.SalesOrderDetail<br /><br />WHERE 0.01 &gt;= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float) <br /><br />and looked at statistics. Even if the overall row lookup decreased 3 times, the spent time had actually increased becouse of string manipulation caused by NEWID function – wich is much slower.<br /><br />Any other ideas? <img src=’/community/emoticons/emotion-2.gif’ alt=’:D‘ />
I’m confused now. Are you looking for "randomly" selected rows or the "last" rows? —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
Sorry I need "last" rows. I haven’t noticed the article was talking about "random" rows, especially that in my particular test case, has selected the "last" rows
Thx.
So, your query looks like select … from (select top 100 * from X order by … DESC) …. where … It this the complete statement or why are you using a derived table here? —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
Frank, That was only an attempt to optimize it.
My query is rather simple in its structure (only two table joins, with few conditions)
If helps you in any way… my query is : select
y.id,
y.carrier,
y.flight_number,
y.flight_date,
y.departure_time,
y.departure_date,
y.aircraft_type,
sum(CASE WHEN x.action_status=’NOT’ THEN x.fbl_pieces ELSE x.exp_pieces END),
sum(CASE WHEN x.action_status=’NOT’ THEN x.fbl_weight ELSE x.exp_weight END),
sum(CASE WHEN x.action_status=’ACC’ THEN x.exp_pieces ELSE 0 END),
sum(CASE WHEN x.action_status=’ACC’ THEN x.exp_weight ELSE 0 END),
y.departed
from
XXX x with (nolock),
YYY y with (nolock)
where
(y.departure_date=’2007-7-6 00:00:00.0′ and y.departure_time>=’1000′ and y.departure_time<=’2000′ ) and
y.origin=’AMS’ and
x.origin=’AMS’ and
x.carrier = y.carrier and
x.flight_number = y.flight_number and
x.flight_date = y.flight_date
group by
y.seq,
y.carrier,
y.flight_number,
y.flight_date,
y.departure_time,
y.departure_date,
y.aircraft_type,
y.departed
order by y.seq DESC
Two tables are used : YYY and XXX, each of them with a huge number of rows
so i need to write this as optimized as possible.
I have removed the SUM and GROUP BY statements, but they don’t make much difference. Thank you for your help,
Johnny

As you are using SQL 2005 there shouldn;t be a problem of setting up it to 90 compatibility mode as it will be. Are you querying against a SQL 2000 server? Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
quote:Originally posted by FrankKalis
quote:
TOP is reproducibly faster than the SET ROWCOUNT statement, and it has an advantage availing from internal feature of the SQL Server if the table has no supporting indexes.
Is it? —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
FYI
http://www.sql-server-performance.com/forum/topic.asp?ARCHIVE=true&TOPIC_ID=1140 Madhivanan Failing to plan is Planning to fail
Thanks Madhi, I knew that. Though it is from back in 2003, I’m not sure this exactly holds anymore. And I also believe that there are only a few cases in which in heap is appropriate. —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
Yes. TABLESAMPLE has some problem
Read section titled important
http://technet.microsoft.com/en-us/library/ms189108.aspx Madhivanan Failing to plan is Planning to fail
quote:Originally posted by Madhivanan Yes. TABLESAMPLE has some problem
Read section titled important
http://technet.microsoft.com/en-us/library/ms189108.aspx Madhivanan Failing to plan is Planning to fail
Also:http://toponewithties.blogspot.com/2005/08/sampling-using-tablesample.html
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
quote:Originally posted by johnny_crash Frank, That was only an attempt to optimize it.
My query is rather simple in its structure (only two table joins, with few conditions)
If helps you in any way… my query is : select
leg.seq,
leg.carrier,
leg.flight_number,
leg.flight_date,
leg.departure_time,
leg.departure_date,
leg.aircraft_type,
sum(CASE WHEN am.action_status=’NOT’ THEN am.fbl_pieces ELSE am.exp_pieces END),
sum(CASE WHEN am.action_status=’NOT’ THEN am.fbl_weight ELSE am.exp_weight END),
sum(CASE WHEN am.action_status=’ACC’ THEN am.exp_pieces ELSE 0 END),
sum(CASE WHEN am.action_status=’ACC’ THEN am.exp_weight ELSE 0 END),
leg.departed
from
awb_move am with (nolock),
flight_leg leg with (nolock)
where
(leg.departure_date=’2007-7-6 00:00:00.0′ and leg.departure_time>=’1000′ and leg.departure_time<=’2000′ ) and
leg.origin=’AMS’ and
am.origin=’AMS’ and
am.carrier = leg.carrier and
am.flight_number = leg.flight_number and
am.flight_date = leg.flight_date
group by
leg.seq,
leg.carrier,
leg.flight_number,
leg.flight_date,
leg.departure_time,
leg.departure_date,
leg.aircraft_type,
leg.departed
order by leg.seq DESC
Two tables are used : flight_leg and awb_move, each of them with a huge number of rows
so i need to write this as optimized as possible.
I have removed the SUM and GROUP BY statements, but they don’t make much difference. Thank you for your help,
Johnny
How many rows is this query supposed to return? And what does the execution plan look like? —
Frank Kalis
Microsoft SQL Server MVP
Contributing Editor, Writer & Forum Moderatorhttp://www.sql-server-performance.com
Webmaster:http://www.insidesql.de
Try using ROW_NUMBER() function as you are using SQL 2005. http://mssqltips.com/tip.asp?tip1175
Sunil "Its nice to be Important, But its more important to be Nice"
]]>