Finding Duplicate Indexes in Large SQL Server Databases

Now we have listed the most obvious offenders. Remember, though, potential duplication means the same indexed columns, in the same order from left to right in the column list for the index, even in cases where the whole list of indexed columns does not match. A slightly more complex self-join of the index list will reveal pairs of potentially redundant indexes where the list of indexed columns is only partly matching:

SELECT l1.tablename,
	l1.indexname,
	l2.indexname AS overlappingIndex,
	l1.col1,
	l1.col2,
	l1.col3,
	l1.col4,
	l1.col5,
	l1.col6,
	l1.col7,
	l1.col8,
	l1.col9,
	l1.col10,
	l1.col11,
	l1.col12,
	l1.col13,
	l1.col14,
	l1.col15,
	l1.col16,
	l1.dpages,
	l1.used,
	l1.rowcnt
FROM vw_index_list l1
INNER JOIN vw_index_list l2 ON l1.tablename = l2.tablename
	AND l1.indexname <> l2.indexname
	AND l1.col1 = l2.col1
	AND (l1.col2 IS NULL OR l2.col2 IS NULL OR l1.col2 = l2.col2)
	AND (l1.col3 IS NULL OR l2.col3 IS NULL OR l1.col3 = l2.col3)
	AND (l1.col4 IS NULL OR l2.col4 IS NULL OR l1.col4 = l2.col4)
	AND (l1.col5 IS NULL OR l2.col5 IS NULL OR l1.col5 = l2.col5)
	AND (l1.col6 IS NULL OR l2.col6 IS NULL OR l1.col6 = l2.col6)
	AND (l1.col7 IS NULL OR l2.col7 IS NULL OR l1.col7 = l2.col7)
	AND (l1.col8 IS NULL OR l2.col8 IS NULL OR l1.col8 = l2.col8)
	AND (l1.col9 IS NULL OR l2.col9 IS NULL OR l1.col9 = l2.col9)
	AND (l1.col10 IS NULL OR l2.col10 IS NULL OR l1.col10 = l2.col10)
	AND (l1.col11 IS NULL OR l2.col11 IS NULL OR l1.col11 = l2.col11)
	AND (l1.col12 IS NULL OR l2.col12 IS NULL OR l1.col12 = l2.col12)
	AND (l1.col13 IS NULL OR l2.col13 IS NULL OR l1.col13 = l2.col13)
	AND (l1.col14 IS NULL OR l2.col14 IS NULL OR l1.col14 = l2.col14)
	AND (l1.col15 IS NULL OR l2.col15 IS NULL OR l1.col15 = l2.col15)
	AND (l1.col16 IS NULL OR l2.col16 IS NULL OR l1.col16 = l2.col16)
ORDER BY
	l1.tablename,
	l1.indexname

This works by locating pairs where the first indexed column is a match, but for each subsequent indexed column there is either a match, or one or both of the values is null.

In our sample case, one pair looks like:

Table Index Col1 Col2 Col3
myTable ind_name_1 LastName NULL NULL
myTable ind_name_2 LastName FirstName NULL

They match because Col1 is a match, and Col2 either matches, or one of the two values is NULL, and Col3 either matches or is NULL, and so on. (Because of the way that indexes are defined, the first time you encounter a NULL, reading across the table, it is guaranteed that all the rest of the values are NULL. Col10, for example, cannot be filled unless Col1 through Col9 are filled.)

In a pair like this, the index with the longer list of keys includes the function of the other one, making the other one redundant. The index with the shorter list of keys can probably be removed.

The resulting list from this series of queries will give you some candidate indexes to remove from the database — but do so carefully. One gotcha: there might be queries in your system that use specific INDEX query hints to force the system to process a query with one or more specific indexes. If an index you find for removal is specifically called in query hints, then the query using that hint might break.

So, as part of your overall strategy for index tuning, you can locate and remove redundant indexes:

  1. Run these queries to find candidates that might be redundant.
  2. From the candidates, verify which are required by the system because they serve a function like enforcement of a unique or primary key constraint.
  3. Check to see if any of your candidates are Clustered Indexes. You can use sp_helpindex, or the function INDEXPROPERTY( tableID, indexName, ‘IsClustered’ ) as shown in the first query in this article. Because clustered indexes actually determine the physical order in which the data in a table is stored, they might need to be retained for that purpose. A table with no clustered index is stored in essentially random order, even if it is indexed other ways, which can cause some inefficiency, and can increase the amount of space required to store other indexes on the table.
  4. Verify that the candidate indexes are not specifically identified in query hints in your other code.
  5. Test removing the redundant indexes, but record their definitions in case they need to be restored for some unforeseen reason.

Extra Credit

The first SELECT statement in this article includes both statistics objects and indexes, and not by accident. If you want to be really thorough, you can change vw_index_list to include both indexes and statistics objects and then test all the results for duplicates.

To perform that search, change vw_index_list to be defined as the first SELECT statement in this article instead of the second one. You can then use the same procedures to locate duplicate statistics objects, or statistics that were created on columns that are also indexed, making the statistics redundant, or redundant indexes, all at once. To make clear which item is which type in the subsequent queries, add the column IsStats from the revised vw_index_list, and look for a 1 or a 0 in the results.

Copyright 2005 by the author.

]]>

Leave a comment

Your email address will not be published.