I've got a test "User" table with 3.3 million rows: Id uniqueidentifier primary key clustered not nullOrganizationId uniqueidentifier not nullField1 integer null...Field20 integer null Field1 through Field20 represent 20 completely independent values that I may need to search on...or not. The search is always in the context of a single Organization, so I've got 20 separate (nonclustered) indexes: (OrganizationId, Field1)...(OrganizationId, Field20)I updated statistics after populating the table. As expected, this query takes no time: select count(*) from UserwhereOrganizationId = 'DEF88071-EF77-4BBD-B650-D1AB22A4B0F2'and Field1 = 70Instantly (0:00) returns a count of 80,808 rows And as expected, the same query with this WHERE clause takes no time: OrganizationId = 'DEF88071-EF77-4BBD-B650-D1AB22A4B0F2'and Field6 = 4699Instantly (0:00) returns a count of 22,500 rows But running the same query on TWO of these fields... OrganizationId = 'DEF88071-EF77-4BBD-B650-D1AB22A4B0F2'and Field1 = 70and Field6 = 4699...was extremely slow: 4:25 to return a count of 21,781 If I was always querying on some known combination of fields I could create a covering index. But in this case these 20 fields are completely independent, so each search could be include any random combination of fields (just depends on which fields the user chooses to filter on). Looking at the execution plans: The first 2 queries each do an Index Seek, Stream Aggregate, Compute Scalar. The third query does an Index Seek (on Field6 which has the smaller row count of 22,500), then a Bookmark Lookup of those 22,500 rows with a Filter (presumably for Field1=70), then Stream Aggregate, and Compute Scalar. On the third query the "cost" shows 98% on Index Seek and 1% for Bookmark Lookup, but I'm betting the Bookmark Lookup is the time-consumer...essentially doing a scan of 22,500 rows to see which ones had Field1=70. I tried this WHERE clause (pseudo-code): OrgId = 'X' and Field1 = 70and Id in (select Id from User where OrgId='X' and Field6 = 4699)And it executed in 1 second: the execution plan showed the 2 separate Index Seeks feeding into Nested Loops (Inner Join), then Stream Aggregate and Compute Scalar. Then I reversed the 2 fields: OrgId = 'X' and Field6 = 4699and Id in (select Id from User where OrgId='X' and Field1 = 70)It also executed in 1 second...similar plan except it showed Nested Loops (Left Semi Join). So I guess I have "a" solution, but I'm wondering... Why such a performance hit? Is there a better approach to the query...perhaps an optimizer hint to force Nested Loops instead of Bookmark Lookup? Is there a different database design that would better support these kind of searches I need to perform? (Basically trying to find all Users who match some random filtering criteria, where each filtering option is an integer value.) Thanks for any insights/suggestions! --Dave
Today I realized the indexes on (OrganizationId, Field#) were redundant because the values stored in Field# were already "owned" by a specific organization. Replacing those indexes with separate indexes for each Field helped the optimizer a lot. It seems to look at all the various fields I'm filtering by, then generally picks the 2 indexes with the fewest hits, then joins them together to find the PKs that are in common, then does a key lookup (using the clustered Id) to filter. In other cases where I'm intentionally searching on just a handful of fields which aren't very distinct, I've seen it do index seeks on a couple tables, Join, then do another index seek and Join that, and finally does a key lookup. Thanks, --Dave