Hi there, I've got a SQL 2005 table(DataTable) with 5.5 million rows and a clustered index on an integer column. The column is a foreign key pointing to another table (LookupTable, clustered index on the integer key) that associates the integer with an nvarchar(50) column. I'm running equivalent select statements with slightly different where clauses, and am not seeing the performance similarities I'd expect. select * from DataTable join LookupTable on DataTable.Integer = LookupTable.Integer 1. Setting the where clause to the nvarchar(50) value on the LookupTable takes 64 seconds. 2. Setting the where clause to Integer = (select integer from LookupTable where nvarchar(50) = 'mysearchstring') takes 61 seconds. 3. Setting the where clause to Integer = 526 takes 1 second. Is this poor design in SQL, or is it a misconfiguration? I would have expected SQL to modify my query to essentially be #3 in all cases, before even touching the DataTable, which would cause all three versions to take 1 second. Thanks, Jason
>>>Setting the where clause to the nvarchar(50) value on the LookupTable takes 64 seconds. If you mean direct select from the lookup table, then the clustered index does not come into picture here. >>>Setting the where clause to Integer = (select integer from LookupTable where nvarchar(50) = 'mysearchstring') takes 61 seconds. The clustered index will only help with joining the Datatable with the Lookup table. The actual retrieval is done in the lookup table. So higher the number of records "common" between the tables higher is the join dataset. >>>Setting the where clause to Integer = 526 takes 1 second. This is the best scenario where the clustere index is actually used. Now, do you have any index on the varchar column? that is the key here. Add a non clustered index on the table and watch the performance improve dramatically.
I checked the fragmentation, it's OK - Which is expected, I made the table a couple hours before my original post. I also made clustered and nonclustered indexes on every column below, and it did not change the query time for the three where clauses below. Let me make the examples clearer. Here are the tables. "LookupTable" Columns - LastNameID - Primary key, Integer LastName - varchar(50) "DataTable" Column - LastNameID - Foreign key pointing to the above primary key. Here is my select statement - select * from LookupTable LookupTable join DataTable DataTable on LookupTable.LastNameID = DataTable.LastNameID Here are the various where clauses that are surprising me 1. Where LookupTable.LastNameID = 526, takes 1 second. 2. Where LookupTable.LastName = 'Smith', takes 61 seconds. 3. Where LookupTable.LastNameID in (select LastNameID from LookupTable where LastName = 'Smith'), takes 64 seconds. It seems like SQL is not very intelligent in running queries #2 and #3. For example, for #2 why doesn't SQL do this - a. Select LastNameID from LookupTable where LastName = 'Smith'. This takes 0 seconds. The LastNameID is 526. b. Select * from LookupTable LookupTable join DataTable Datable on LookupTable.LastNameID = DataTable.LastNameID where LookupTable.LastNameID = 526. This is where clause #1, and takes 1 second. I would have expected SQL to take similar steps for where clause #3. Am I expecting execution intelligence that does not exist?
[quote user="jdblack"] Let me make the examples clearer. Here are the tables. "LookupTable" Columns - LastNameID - Primary key, Integer LastName - varchar(50) "DataTable" Column - LastNameID - Foreign key pointing to the above primary key. Here is my select statement - select * from LookupTable LookupTable join DataTable DataTable on LookupTable.LastNameID = DataTable.LastNameID Here are the various where clauses that are surprising me 1. Where LookupTable.LastNameID = 526, takes 1 second. 2. Where LookupTable.LastName = 'Smith', takes 61 seconds. 3. Where LookupTable.LastNameID in (select LastNameID from LookupTable where LastName = 'Smith'), takes 64 seconds. [/quote] How many rows are supposed to be returned by #2? The same single row as in #1? You write that DataTable.LastNameID is the foreign key pointing to the lookup table. By default creating such a relationship in SQL Server does not imply that an index on that DataTable.LastNameID column is created. Have you created such an index manually? Also, what do the execution plans look like? Since you are using SQL Server 2005 you might also want to have a look at the INCLUDE clause described in CREATE INDEX in BOL.
Yes, the where clauses are equivalent and return the same number of rows. The index was created manually. Adding LastNameID > 0 has no effect. Thanks, Jason
Now it is becoming intereting!!! I think LastName column has no indexes.. If yes, then please try to create one non-clustered index on it and if no..Then is this non-clustered index created before the clustered one? But if this answer is also no..then drop this non-clustered index and recreate it. Please try it once and let me know the output.Sorry..still now unable to point out your problem.
Please go for a quick check: Select LastNameID from LookupTable where LastName = 'Smith' and LastNameID > 0 Is this qury also taking same time?