We encountered a problem in a select statement. Beneath we describe a piece of T-SQL to retreive 5050 rows out of the table tblStk0Prestatie -- ------------------------------------------------------------------------- declare @WatchStopTime1 datetime declare @WatchStopTime2 datetime declare @WatchStopTime3 datetime declare @WatchStopTime4 datetime set @WatchStopTime1 = getdate() -- -- table tblStk0Prestatie contains 190.000 rows equally distributed ovre 3 months -- --Primary key = Network_Nr, Prestatie_Nr ( both integer ) - Clustered Index -- --Index = IX_tblStk0Prestatie is an index on PrestatieDatum -- -- -- Select 1 : Select all rows for a given day - forced to use the index IX_tblStk0Prestatie which is an index on 'PrestatieDatum' -- -- SELECT Network_Nr, Prestatie_Nr, PrestatieDatum, LocLst_Nr, Negatief, Van, Tot, Loc_Nr, KindEntiteit_Network_Nr, KindEntiteit_Entiteit_Nr FROM dbo.tblStk0Prestatie with (INDEX(IX_tblStk0Prestatie)) WHERE (PrestatieDatum = CONVERT(DATETIME, '2006-02-21 00:00:00', 102)) set @WatchStopTime2 = getdate() -- -- Select 2 : Select all rows for a given day - without forcing to use the index IX_tblStk0Prestatie -- -- SELECT Network_Nr, Prestatie_Nr, PrestatieDatum, LocLst_Nr, Negatief, Van, Tot, Loc_Nr, KindEntiteit_Network_Nr, KindEntiteit_Entiteit_Nr FROM dbo.tblStk0Prestatie WHERE (PrestatieDatum = CONVERT(DATETIME, '2006-02-21 00:00:00', 102)) set @WatchStopTime3 = getdate() -- -- Select 3 : Select all rows for a given day - without forcing to use the index IX_tblStk0Prestatie -- --but less columns are used -- -- SELECT Network_Nr, Prestatie_Nr, PrestatieDatum FROM dbo.tblStk0Prestatie WHERE (PrestatieDatum = CONVERT(DATETIME, '2006-02-21 00:00:00', 102)) set @WatchStopTime4 = getdate() Print 'Select 1 : Select all rows for a given day - forced to use the index IX_tblStk0Prestatie which is an index on ''PrestatieDatum''' print datediff(ms, @WatchStopTime1, @WatchStopTime2) Print 'Select 2 : Select all rows for a given day - without forcing to use the index IX_tblStk0Prestatie ' print datediff(ms, @WatchStopTime2, @WatchStopTime3) Print 'Select 3 : Select all rows for a given day - without forcing to use the index IX_tblStk0Prestatie' print datediff(ms, @WatchStopTime3, @WatchStopTime4) --------------------------------------------------------------------- This results in the following numbers : Select 1 : Select all rows for a given day - forced to use the index IX_tblStk0Prestatie which is an index on 'PrestatieDatum' = 110 milliseconds = Execution plan : 1. select uses the index IX_tblStk0Prestatie ( Index Seek ) 2. a bookmark lookup 3. a select Select 2 : Select all rows for a given day - without forcing to use the index IX_tblStk0Prestatie = 280 milliseconds = Execution plan : 1. select uses 'clustered index scan' 2. a select [?] Which index is used and why ( the only clustered index is an index on the primary keys ( Network_nr, Prestatie_Nr ) ) [?] And a lousy performance Select 3 : Select all rows for a given day - without forcing to use the index IX_tblStk0Prestatie but with only 3 columns = 16 milliseconds = Execution plan : 1. select uses the index IX_tblStk0Prestatie ( Index Seek ) 2. Select [?] Why this increased performance ?
HI ya, sounds like the statistics need to be updated for that table... the second is used, because based on the stats sql thinks there are too many rows to retrieve, making the bookmark lookup too expensive (eventhough in reality it isn't) the last one is faster because the non-clustered index always includes all the fields from the clustered index at it's leaf level. So SQL doesn't have to read the underlying table at all... I'd also remove the convert since '2006-02-21 00:00:00' will always be a valid date anyway, but ought to not make a significant difference in performance Cheers Twan
Nr 3 is asking for columns which are all available in indexes - Network_Nr, Prestatie_Nr are your clustered PK, PrestatieDatum has an index of its own, so all the data that you're asking for can be read from the index - which gives you the best performance. Nr 1 and 2 are looking up some columns that are not covered by the index, so they would both add a bookmark lookup to refer to the actual row in the table where the data can be found - but since you already have read the data in Nr 1, Nr 2 is reading the data from cache, so it can skip the bookmark lookup. Nr 3 probably also benefits fom the cached data. Nr 2 is using the PK index: it says "clustered index scan" in the plan ... Like Twan is saying, you should check if there's an improvement when you change the WHERE clause to ... WHERE PrestatieDatum = '2006-02-21'.
quote:Originally posted by Adriaan Nr 2 is using the PK index: it says "clustered index scan" in the plan ... Like Twan is saying, you should check if there's an improvement when you change the WHERE clause to ... WHERE PrestatieDatum = '2006-02-21'. Thanks for replying to my questions, but still I don't understand why sql uses the PK index since it doesn't include any column that has something to do with 'PrestatieDatum' and there is a better alternative key IX_tblStk0Prestatie ( index on PrestatieDatum )
Well, it has to go through the PK index to even find the records. If your table had an identity column and you included that in the select list, then perhaps the query could bypass the PK. Create a table, same design as in your existing table, but without a PK. Now copy the data over and run the same query against this table. See?