Query optimizer picking the wrong query plan? | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Query optimizer picking the wrong query plan?

We have a query that just recently started acting up on us, and we think it is because the query optimizer has started choosing the wrong query plan on occassion. This is the query: SELECT TOP 1 l1.ListingID
FROM Listings l1
WHERE NOT EXISTS
(SELECT TOP 1 a.auctionid
FROM auctions a
WHERE a.listingid = l1.listingid
AND a.ispaid = 1)
AND l1.UserID = 12250
AND l1.ListingIsActive = 1
AND (DATEDIFF(d, l1.AddedDate, GETDATE()) <= 120 OR DATEDIFF(d, l1.ModDate, GETDATE()) <= 120) Sometimes it will choose a plan using nested loops and clustered index scans (which is very slow). Other times it will choose a hash and clustered index scans which runs very fast. Should I try using hints to fix this problem? I’ve read that they are a last resort. Rearranging the order of the WHERE clauses worked for a few days but not anymore. Is there a better way to write this query? The listings table has ~15000 rows, the auctions table has ~47000 rows. Auto update statistics is on. The server is a dual processor XEON 2.4 Ghz with 2GB memory. There are clustered indexes on auctions.auctionid and listings.listingid. -Joe
Is this the query as its called or do you pass parameters for some of the values? Is it an sproc or run adhoc each time?
I just plugged in some values that were causing the problem. It is actually being called from ColdFusion which passes it to SQL server. All the values are dynamic and are called as bind variables (using cfqueryparam). Interestingly when I run the query in query analyzer it always picks the right query plan. I had to capture a trace off the server to figure out the query plan issue. -Joe
If bind variables are used then that is essentially equivalent to SQL variables. If you pull the query out, paste it into QA. Now instead of just plugging values into the query, declare variables and set these to your values, then plug these variables into the query. This should simulate what ColFusion is doing. You may find that updating the statistics on the two tables is what it needs… The query itself is not really the most effient though…
– a not exists is generally not good for performance…
– the query uses top 1 but no order by…?
– the following query may be of use SELECT TOP 1 l1.ListingID
FROM Listings l1
left outer join auctions a
ON a.listingid = l1.listingid
AND a.ispaid = 0
WHERE l1.UserID = 12250
AND l1.ListingIsActive = 1
AND (DATEDIFF(d, l1.AddedDate, GETDATE()) <= 120 OR DATEDIFF(d, l1.ModDate, GETDATE()) <= 120)
order by l1.ListingID — use L1.ListingID DESC to get the latest listing indexing Listings( userid, listingisactive, addeddate, moddate )
and auctions( listingid, ispaid ) Cheers
Twan

Twan,
Thanks, that query runs much faster, but I’m not sure if the logic is the same. The purpose of this query is to determine if there are any records in the listings table that do not have an associated record (related on listingid) in the auctions table. I don’t think I can do this with a join like you suggested because that would return all rows that have this relationship. Are those composite indexes or separate indexes for each column? -Joe
A check for records which do not exist in auctions table needs to be added… SELECT TOP 1 l1.ListingID
FROM Listings l1
left outer join auctions a
ON a.listingid = l1.listingid
AND a.ispaid = 0
WHERE l1.UserID = 12250
AND a.listingid IS NULL
AND l1.ListingIsActive = 1
AND (DATEDIFF(d, l1.AddedDate, GETDATE()) <= 120 OR DATEDIFF(d, l1.ModDate, GETDATE()) <= 120)
order by l1.ListingID — use L1.ListingID DESC to get the latest listing Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

The views expressed here are those of the author and no one else. There are no warranties as to the reliability or accuracy of anything presented here.
And the answer ro the original question would be it is difficult to understand why SQL Server seems to use worng plan until you understand the data distribution for the data being passed as variables. Say for one set of variables, large no. of records are affected, this would lead to a plan where Clustered Index scan may be benefecial. But if the same plan is used for variable where Clustered Index seek would have been helpful, as less no. of records are being affected, it will lead to bad performance. So the idea is to understand the data distribution and if the skew is too bad, go for recompilation of the plan at every execution using stored procedures and using WITH RECOMPILE OPTION.
Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

The views expressed here are those of the author and no one else. There are no warranties as to the reliability or accuracy of anything presented here.
So the idea is to understand the data distribution and if the skew is too bad, go for recompilation of the plan at every execution using stored procedures and using WITH RECOMPILE OPTION.
I’m not contradicting Gaurav’s tip but instead try to schedule a job to recompile the involved stored procedues rather using WITH RECOMPILE option. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
This rewritten query now works perfectly. Thanks for all your help. -Joe
Here is a response from Microsoft Support about this issue. Although the query is now fixed, it might offer some additional insight into this issue. I think you probably want to try with different indexes to make this query faster. Indexes are used to allow efficient seeks for selective (i.e. filtering out many rows) predicates. In your query you have predicates over the UserID and ListingIsActive columns for the Listings table. Is any of the following conditions such to usually filter out many rows? UserID = 12250
ListingIsActive = 1 In case you may want to add a nonclustered index over the corresponding column, or change the clustered index to be on that column. Same about the auctions table. Since the predicate in the subquery is over auctions.listingid, you may want to add a nonclustered index over that column, or change the clustered index to be on that column. Finally, the TOP 1 clause in the subquery is redundant, i.e. it will neither do harm nor improve things. I hope it helps! This posting is provided "AS IS" with no warranties, and confers no rights. —————————–
Brad M. McGehee, MVP
Webmaster
SQL-Server-Performance.Com
Hi ya, sorry have not been able to get back to the forum for a while… Ah yes the logic in my query wasn’t quite right was it… as guarav pointed out the listingid is null check needs to be added, and then also the ispaid check should see if the value is 1 rather than 0
The indexes I mentioned are composite indexes so only one on each table Cheers
Twan
This is all great advice. We will be testing with the new composite indexes under load this week. Thanks again.
]]>