Execution Plan on Sum Query | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Execution Plan on Sum Query

I have a simple SUM query: SELECT SUM(order_amt)
FROM order_line
WHERE order_date between ‘4/1/04’ and ‘5/1/04’ This query hits about 1.4 million rows out of a 48 million row table (approx 3% of the table). I ran the Execution Plan for this query and it performs a clustered index scan. The clustered index (ord_id, retail_loc) does not contain any columns used in this query. There is a nonclustered index on order_date though. I thought the selectivity of the query would have resulted in the Execution Plan using the nonclustered index. If I force it with a hint, it will do the nonclustered index seek with a bookmarp lookup. Still not optimal – I will probably add a nonclustered index on the order_amt to make it covering and get rid of the bookmark lookup. I’m still confused as to why the optimizer chooses to peform a clustered index scan? Thanks
Have you updated the statistics on the order_date index? If they are very out of date, this can sometimes mean SQL will feel using it will get no performance.
Try updating the statistics and check the plan again.
Yes – I updated statistics before checking out the execution plans. I ran the statistics with a sample of 20 percent – maybe thats too low? (I will probably end up creating a nonclustered composite index on order_date, order_amt)
Just put a nonclustered index on order_amt. Also have you tried recompiling the query? MeanOldDBA
[email protected] When life gives you a lemon, fire the DBA.
I would go with a covering index on (order_date, order_amt) Note: Edited from hovering to ‘a covering’ just in case somebody would be sent wondering.
In terms of recompiling – I wasn’t running the statement in a stored procedure but will give a try. I tried to clear the proc cache but do not have the appropriate privileges (on QA). Each time I’ve been running the query, I have changed the range values – is that enough of a change to force a new execution plan? And each time I’ve changed the indexing I’ve updated the stats. By adding a new non clustered index on order_amt, the plan performs a hash match. An index seek on the order_date generates the hash table, and an index scan on order_amt does the probing. This runs in about 1.5 mins. By using a composite non clustered index on order_date, order_amt, the plan does an index seek on order_date and runs in less than 3 secs. I’m going to confirm with the users that performing any sort of aggregation on order_amt is only relevant within the context of a date (or at least 95% of the time – no telling what people will query). If I have their buy in – will go ahead with the composite.
The composite is definitely the way to go. On changing the range values to force a new execution plan it depends. It will sometimes force a new plan; however, there is a complicated cost threshold ratio to determine if it’s sufficient or not. MeanOldDBA
[email protected] When life gives you a lemon, fire the DBA.
depending on a bunch of stuff,
if the estimated row count is more than 1/3 the number of pages,
it will probably use a table scan instead index seek with bookmark lookup (NC index on order_date only),