Index Performance Audit Checklist

Does Every Table in Each Databases Have a Clustered Index?

As a rule of thumb, every table in every database should have a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases–such as an identity column, or some other column where the value is increasing–and is unique. In many cases, the primary key is the ideal column for a clustered index. 

If you have any experience with performance tuning SQL Server 6.5, you may have heard that is not a good idea to add a clustered index to a column that monotonically increases because it can cause a “hotspot” on the disk that can cause performance problems. That advice is true in SQL Server 6.5.

In SQL Server 7.0 and 2000, “hotspots” aren’t generally a problem. You would have to have over 1,000 transactions a second before a “hotspot” were to negatively affect performance. In fact, a “hotspot” can be beneficial under these circumstances because it eliminates page splits.

Here’s why. If you are inserting new rows into a table that has a clustered index as its primary key, and the key monotonically increases, these means that each INSERT will physically occur one after another on the disk. Because of this, page splits won’t occur, which in itself saves overhead. This is because SQL Server has the ability to determine if data being inserted into a table has a monotonically increasing sequence, and won’t perform page splits when this happens.

If you are inserting a lot of rows into a heap (a table without a clustered index), data is not inserted in any particular order onto data pages, whether the data is monotonically or not monotonically increasing. This results in SQL Server having to work harder (more reads) to access the data when requested from disk. On the other hand, if a clustered index is added to a table, data is inserted sequentially on data pages, and generally less disk I/O is required to retrieve the data when requested from disk.

If data is inserted into a clustered index in more or less random order, data is often inserted randomly into data pages, which is similar to the problem of inserting data into a heap, which contributes to page splits.

So again, the overall best recommendation is to add a clustered index to a column that monotonically increases (assuming there is a column that does so), for best overall performance. This is especially true if the table is subject to many INSERTS, UPDATES, and DELETES. But if a table is subject to few data modification, but to many SELECT statements, then this advice is less useful, and other options for the clustered index should be considered.

As part of your index audit, check to see if every table in your databases has an index or not. If there is no index at all, seriously consider adding a clustered index, based on the advice provided above. There is virtually no downside to adding a clustered index to a table that does not already have one.

Are Any of the Columns in Any Table Indexed More than Once?

This may sound like obvious advice, but it is more common than you think, especially if a database has been around for awhile and it has been administered by multiple DBAs. SQL Server doesn’t care if you do this, as long as the names of the indexes are different. So as you examine your table’s current indexes, check to see if any columns have unnecessary duplicate indexes. Removing them not only reduces disk space, but speeds up all data access or modification to that table.

One common example of duplicate indexes is forgetting that columns that have a primary key, or that are specified as unique, are automatically indexed, and then indexing them again under different index names.

Are There Any Indexes that are Not Being Used in Queries?

This is another obvious piece of advice, but is also a common problem, especially if the initial indexes created for the database were “guessed at” by DBAs or developers before the database went into production. Just looking at a table’s indexes won’t tell you if they index is being used or not, so identifying unused indexes is not always easy.

One of the best ways to identify unused indexes is to use the Index Tuning Wizard, which was previously discussed.

Unnecessary indexes, just like duplicate indexes, wastes disk space and contribute to less than optimal data access and modification performance.

Are the Indexes too Wide?

The wider an index, the bigger the index becomes physically, and the more work SQL Server has to perform when accessing or modifying data. Because of this, you should avoid adding indexes to very wide columns. The narrower the index, the faster it will perform. 

In addition, composite indexes that include two or more columns, also present the same potential problem. Generally, composite indexes should be avoided, if at all possible. Often, the heavy use of composite indexes in a database means that the database design is flawed.

You can’t always avoid indexing wide columns or using composite indexes, but if you think you have to use one, be sure you have carefully evaluated your choice and are confident you don’t have other options that may offer better performance.

Are Tables That are JOINed Have the Appropriate Indexes on the JOINed Columns?

In essence, the column (or columns) used in tables being JOINed should be indexed for best performance. This is straight-forward advice and fairly obvious, but auditing your indexes for optimal JOIN performance is not easy, as you must be familiar with all the JOINs being performed in your database in order to fully perform the audit.

Many people, when creating primary key/foreign key relationships (which are often used in JOINs) forget that while an index is automatically created on the primary key column(s) when it is established, an index for a foreign key is not automatically created, and must be created manually if there is to be one.

Because this is often forgotten, as part of your audit, you may want to  identify all primary key/foreign key relationships in your tables and then verify that each foreign key column has an appropriate index.

Besides this, you can also use the Index Tuning Wizard can help identifying missing JOIN indexes, but I have found that the Wizard doesn’t always identify missing indexes for JOINed tables. When it comes right down to it, unless you know the types of common JOINs being run against your database, it is tough to identify all the columns that could benefit from an appropriate index.

Are the Indexes Unique Enough to be Useful?

Just because a table has one or more indexes doesn’t mean that the SQL Server Query Analyzer will use them. Before they are used, the Query Optimizer has to consider them useful. If a column in a table is not at least 95% unique, then most likely the query optimizer will not use an available non-clustered index based on that column. Because of this, don’t add non-clustered indexes to columns that aren’t at least 95% unique. For example, a column with “yes” or “no” as the data won’t be at least 95% unique, and creating an index on that column would in essence create an index that would never be used, which we have already learned puts a drag on performance.

As part of your audit, consider “eye-balling” the data in your tables. In other words, take a look at the data residing in your tables, and take an extra look at the columns that are indexed. Generally, it is very obvious if the data in a column is selective or not. If you notice that the data is all “male or female,” “y” or “n,” and so on, then this data is not selective and any indexes created on them will be a waste of space and a hindrance to performance.

Continues…

Leave a comment

Your email address will not be published.