Finding Duplicate Indexes in Large SQL Server Databases

So, let’s refine the query to return only the true indexes, and while we’re at it, store this select statement as a view for use later:

CREATE VIEW vw_index_list AS
SELECT tbl.[name] AS TableName,
	idx.[name] AS IndexName,
	INDEX_COL( tbl.[name], idx.indid, 1 ) AS col1,
	INDEX_COL( tbl.[name], idx.indid, 2 ) AS col2,
	INDEX_COL( tbl.[name], idx.indid, 3 ) AS col3,
	INDEX_COL( tbl.[name], idx.indid, 4 ) AS col4,
	INDEX_COL( tbl.[name], idx.indid, 5 ) AS col5,
	INDEX_COL( tbl.[name], idx.indid, 6 ) AS col6,
	INDEX_COL( tbl.[name], idx.indid, 7 ) AS col7,
	INDEX_COL( tbl.[name], idx.indid, 8 ) AS col8,
	INDEX_COL( tbl.[name], idx.indid, 9 ) AS col9,
	INDEX_COL( tbl.[name], idx.indid, 10 ) AS col10,
	INDEX_COL( tbl.[name], idx.indid, 11 ) AS col11,
	INDEX_COL( tbl.[name], idx.indid, 12 ) AS col12,
	INDEX_COL( tbl.[name], idx.indid, 13 ) AS col13,
	INDEX_COL( tbl.[name], idx.indid, 14 ) AS col14,
	INDEX_COL( tbl.[name], idx.indid, 15 ) AS col15,
	INDEX_COL( tbl.[name], idx.indid, 16 ) AS col16,
	dpages,
	used,
	rowcnt
FROM SYSINDEXES idx
INNER JOIN SYSOBJECTS tbl ON idx.[id] = tbl.[id]
WHERE indid > 0
	AND INDEXPROPERTY( tbl.[id], idx.[name], 'IsStatistics') = 0

This will yield all the indexes, but not the table or statistics entries, from SYSINDEXES.

The next step is to create a query against that view that will reveal any duplication. The narrowest case of duplication is exactly the same indexed columns, in the same order, in more than one index. A self-join of the index list, comparing the columns being used by each index, will reveal pairs of potentially redundant indexes:

SELECT l1.tablename,
	l1.indexname,
	l2.indexname AS duplicateIndex,
	l1.col1,
	l1.col2,
	l1.col3,
	l1.col4,
	l1.col5,
	l1.col6,
	l1.col7,
	l1.col8,
	l1.col9,
	l1.col10,
	l1.col11,
	l1.col12,
	l1.col13,
	l1.col14,
	l1.col15,
	l1.col16,
	l1.dpages,
	l1.used,
	l1.rowcnt
FROM vw_index_list l1
INNER JOIN vw_index_list l2 ON l1.tablename = l2.tablename
	AND l1.indexname <> l2.indexname
	AND l1.col1 = l2.col1
	AND COALESCE(l1.col2,'') = COALESCE(l2.col2,'')
	AND COALESCE(l1.col3,'') = COALESCE(l2.col3,'')
	AND COALESCE(l1.col4,'') = COALESCE(l2.col4,'')
	AND COALESCE(l1.col5,'') = COALESCE(l2.col5,'')
	AND COALESCE(l1.col6,'') = COALESCE(l2.col6,'')
	AND COALESCE(l1.col7,'') = COALESCE(l2.col7,'')
	AND COALESCE(l1.col8,'') = COALESCE(l2.col8,'')
	AND COALESCE(l1.col9,'') = COALESCE(l2.col9,'')
	AND COALESCE(l1.col10,'') = COALESCE(l2.col10,'')
	AND COALESCE(l1.col11,'') = COALESCE(l2.col11,'')
	AND COALESCE(l1.col12,'') = COALESCE(l2.col12,'')
	AND COALESCE(l1.col13,'') = COALESCE(l2.col13,'')
	AND COALESCE(l1.col14,'') = COALESCE(l2.col14,'')
	AND COALESCE(l1.col15,'') = COALESCE(l2.col15,'')
	AND COALESCE(l1.col16,'') = COALESCE(l2.col16,'')
ORDER BY
	l1.tablename,
	l1.indexname
            

Let’s analyze how this works: each index is compared to all others in the list. Col1 cannot be NULL, because every index must reference at least one column. Col2 – Col16 might be filled or NULL, but because we cannot compare nulls for equality, we compare them using COALESCE() to replace any NULLs with an empty string ”.

That query will show any exact duplicate pairs. What you may find is that there are automatically created indexes, built by SQL Server to implement Unique or Primary Key constraints, that are duplicated by other indexes that were manually added. A pair like this might show one index named “PK_name,” which was created by virtue of there being a primary key on a table, and another index that someone created, also on the primary key column. The first cannot be removed, because it’s required for the primary key constraint, but the second can probably be dropped.

As an aside, you can investigate which indexes are created by SQL Server for constraints by looking at:

SELECT * FROM information_schema.table_constraints

and

SP_HELPINDEX 'yourTable'

and

SP_HELPCONSTRAINT 'yourTable'

or even

SELECT table_name,
	constraint_name,
	indexname,
	constraint_type
FROM information_schema.table_constraints c
INNER JOIN vw_index_list i ON c.constraint_name = i.indexname
            

You’ll find that there are paired constraints and indexes with the same names, where the index is the mechanism used to enforce the constraint.

Continues…

Leave a comment

Your email address will not be published.