The reason we want to know if an index is selective enough or not is because if it isn’t, then it won’t be used. And if an index won’t be used, there is no point in having it. Most likely, dropping unnecessary indexes can boost the performance of your application because indexes, as you probably know, slow down INSERTs, UPDATEs, and DELETEs in a table because of the overhead of maintaining indexes. And if the table in questions is subject to a high level of database changes, index maintenance can be the cause of some bottlenecks. So our goal is to ensure that if we do have indexes, that they are selective enough to be useful. We don’t want to maintain indexes that won’t be used.
The command we will use to find the selectivity of an index is:
DBCC SHOW_STATISTICS (table_name, index_name)
When this command runs, it produces an output similar to the following. This is a real result based on one of the databases I maintain.
Statistics for INDEX ‘in_tran_idx’.
Updated Rows Rows Sampled Steps Density Average key length
——————– ——- ———— —— ———— ——————
Feb 24 2001 3:36AM 7380901 7163688 300 2.2528611E-5 0.0
(1 row(s) affected)
All density Columns
(300 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
This result includes a lot of information, most of which is beyond the scope of this article. What we one to focus on is the density value “2.2528611E-5” under the “All density” column heading.
Density refers to the average percentage of duplicate rows in an index. If an indexed column, such as employeeid, has much duplicate data, then the index is said the have high density. But if an indexed column has mostly unique data, then the index is said to have low density.
Density is inversely related to selectivity. If density is a high number, then selectively is low, which means an index may not be used. If density is a low number, then selectivity is high, and an index most likely will be used.
In the sample printout above, the density for the index is less than 1%. In turn, this means the selectivity of the table is over 99%, which means that the index is probably very useful for the Query Optimizer.
If you are an advanced DBA, you will probably have already noticed that I have oversimplified this discussion. Even so, the point I want to make in this article is still very valid, and my point is, is that not all indexes are equal. Just because an index is available to the Query Optimizer does not mean it will always be used.
For DBAs, this means you need to be wary of your table’s indexes. As time permits, you may want to run the DBCC SHOW_STATISTICS command and see how selective your indexes actually are. You may find that some of your indexes are not being used. And if this is the case, you may want to consider removing them, which in turn may speed up your application.
For new DBAs, removing, rather than adding indexes may seem a backward way to performance tune your database. But the more you learn about SQL Server works internally, the better you will understand the limits of using indexes to performance tune your applications.
To read about this subject in much greater depth, see chapter 15, “The Query Processor” in the book, Inside Microsoft SQL Server 2000, by Kalen Delaney.]]>