Tips on Optimizing SQL Server Clustered Indexes
Clustered indexes are useful for queries that meet these specifications:
· For queries that SELECT by a large range of values or where you need sorted results. This is because the data is already presorted in the index for you. Examples of this include when you are using BETWEEN, <, >, GROUP BY, ORDER BY, and aggregates such as MAX, MIN, and COUNT in your queries.
· For queries that look up a record with a unique value (such as an employee number) and you need to retrieve most or all of the data in the record. This is because the query is covered by the index. In other words, the data you need is in the index itself, and SQL Server does not have to read any additional pages to retrieve the data you want.
· For queries that access columns with a limited number of distinct values, such as a columns that holds country or state data. But if column data has little distinctiveness, such as columns with a yes or no, or male or female, then won’t want to “waste” your clustered index on them.
· For queries that use the JOIN or GROUP BY clauses.
· For queries where you want to return a lot of rows, just not a few. Again, this is because the data is in the index and does not have to be looked up elsewhere. [6.5, 7.0, 2000, 2005] Updated 12-6-2005
If you run into a circumstance where you need to have a single wide index (a composite index of three or more columns) in a table, and the rest of the indexes in this table (assuming there are two or more) will only be one column wide, then consider making the wide index a clustered index and the other indexes non-clustered indexes.
Why? If the wide index is a clustered index, this means that the entire table is the index itself, and a large amount of additional disk space is not required to create the index. But if the wide index is a non-clustered index, this means SQL Server will have to create a “relatively large” index, which will indeed consume a large amount of additional disk space. Whenever the index needs to be used by the query optimizer, it will be more efficient to access the clustered index than the non-clustered index, and performance will be better. [6.5, 7.0, 2000, 2005] Updated 3-5-2004
Avoid clustered indexes on columns that are already “covered” by non-clustered indexes. A clustered index on a column that is already “covered” is redundant. Use the clustered index for columns that can better make use of it. [6.5, 7.0, 2000, 2005] Updated 3-5-2004
When selecting a column to base your clustered index on, try to avoid columns that are frequently updated. Every time that a column used for a clustered index is modified, all of the non-clustered indexes must also be updated, creating additional overhead. [6.5, 7.0, 2000, 2005] Updated 3-5-2004
When selecting a column or columns to include in your clustered index, select the column (or for the first column in a composite index) that contains the data that will most often be searched in your queries. [6.5, 7.0, 2000, 2005] Updated 3-5-2004
If a table has both a clustered index and non-clustered indexes, then performance will be best optimized if the clustered index is based on a single column that is as narrow as possible. This is because non-clustered indexes use the clustered index to locate data rows and because non-clustered indexes must hold the clustered keys within their B-tree structures. This helps to reduce not only the size of the clustered index, but all non-clustered indexes on the table as well. [6.5, 7.0, 2000, 2005] Updated 3-5-2004
The primary key you select for your table should not always be a clustered index. If you create a primary key and don’t specify otherwise, this is the default. Only make the primary key a clustered index if you will be regularly performing range queries on the primary key or need your results sorted by the primary key. [6.5, 7.0, 2000, 2005] Updated 3-5-2004
If you drop or change a clustered index, keep in mind that all of the non-clustered indexes also have to be rebuilt. Also keep in mind that to recreate a new clustered index, you will need free disk space equivalent to 1.2 times the size of the table you are working with. This space is necessary to recreate the entire table while maintaining the old table until the new table is recreated. Then the old table is deleted. [6.5, 7.0, 2000, 2005] Updated 3-5-2004