General Tips on Optimizing SQL Server Indexes
All the tips provided on this website about indexing are general guidelines. As with any general guideline, there are exceptions. Because of this, it is a good idea to test out various indexing strategies for the most common queries run against your database. Only by testing different indexing strategies can you be sure that you have fully optimized your database. [6.5, 7.0, 2000, 2005] Updated 4-3-2006
If your databases are set for “Auto Create Statistics,” the SQL Server Query Optimizer, when running queries, will consider if there is any benefit to adding statistics for any column that doesn’t already have statistics for it. This is for columns without indexes. This is a good thing as it helps provide better information to the Query Optimizer so that better optimized execution plans are created to execute queries.
The addition of an automatically added statistics to a column is also useful for something else. It is a clue to the potential need for an index on the column. In other words, if the Query Optimizer thinks that column statistics are useful, there is also a good chance that adding an appropriate index to this same column would be useful. This is not always the case, so you will need to perform some testing — before and after an index is added — to see if adding an index actually helps or not. But this is simple to do.
How do you know if the Query Optimizer has automatically created column statistics on a column in a table? Actually, this is quite easy to find out. Run the following query from Query Analyzer or Management Studio, which is pointing to a user database.
WHERE (name LIKE ‘%_WA_Sys%’)
This query will return all of the columns from the tables in your database that have column statistics on them that have been added automatically by the Query Optimizer. The value that is in the “name” column of the sysindexes table is the name assigned to the statistics that SQL Server keeps track of for the named column. This information provide you a starting point from which to explore whether or not adding indexes to these columns will be useful or not. [7.0, 2000, 2005] Updated 4-3-2006
Indexes cannot be created in a vacuum. In other words, before you can identify and create optimal indexes for your tables, you must thoroughly understand the kinds of queries that will be run against them. This is not an easy task, especially if you are attempting to add indexes to a new database.
Whether you are optimizing the indexes for the first time for a new database, or for a current production database, you need to identify what queries are run, and how often they are run. Obviously, you will want to spend more time creating and tuning indexes for queries that are run very often than for queries that are seldom run. In addition, you will want to identify those queries that are the most resource intensive, even if they aren’t run the most often.
Once you know which queries run the most often, and which are the most resource intensive, you can begin to better allocate your time in order to get the biggest bang for your time invested.
But there is still one little question. How do you identify which queries are run the most often, and which are the most resource intensive? The easiest solution is to capture a Profiler trace, which can be configured to identify which queries run the most often, and to identify which queries use the most resources. How you configure Profiler won’t be discussed now, as it would take a large article to explain all the options. The point here is to make you aware that the Profiler is the tool of choice to identify the queries that are being run against your database.
If you are adding indexes to a production database, capturing the data you need is simple. But if your database is new, what you will need to do is to simulate actual activity as best as possible, perhaps during beta testing of the application, and capture this activity. While it may not be perfect data, it will at least give you a head start. And once production begins, you can continue your index tuning efforts on an on-going basis until you are relatively satisfied that you have identified and tuned the indexes to the best of your ability.
Once you have identified the key queries, your next job is to identify the best indexes for them. This is also a process too big to describe in this single tip, although there are many tips on this website that relate directly to this issue. Essentially, what you need to do is to run each query you need to analyze in Query Analyzer or Management Studio, examining how it works, and examining its execution plan. Then based on your knowledge of the query, and your knowledge of indexing and how it works in SQL Server, you begin the art and science of adding and optimizing indexes for your application. [6.5, 7.0, 2000, 2005] Updated 4-3-2006
As a rule of thumb, every table should have at least a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases — such as an identity column, or some other column where the value is increasing — and is unique. In many cases, the primary key is the ideal column for a clustered index. [7.0, 2000] Updated 4-3-2006
Indexes should be considered on all columns that are frequently accessed by the WHERE, ORDER BY, GROUP BY, TOP, and DISTINCT clauses. Without an index, each of these operations will require a table scan of your table, potentially hurting performance.
Keep in mind the word “considered.” An index created to support the speed of a particular query may not be the best index for another query on the same table. Sometimes you have to balance indexes to attain acceptable performance on all the various queries that are run against the tables of a database. [6.5, 7.0, 2000, 2005] Updated 4-3-2006
An index on a column can often be created different ways, some of which are more optimal that others. What this means that just because you create a useful index on a column doesn’t mean that it automatically is the optimum version of that index. It is quite possible that a different variation of the same index is faster.
The most obvious example of this is that an index can be a clustered or non-clustered. Another example of how an index is created that can affect its performance is the FILLFACTOR and PAD_INDEX settings used to create it. Also, whether the index is also a composite index or not (and what columns it contains) can affect an index’s performance.
Unfortunately, there is no easy answer as to which variation of the same index is the fastest in your situation, as the data and queries run against the data vary.
While I can’t offer you specific rules that fit in all cases, the index tips you find on this website should help you decide which variation of an index is best in your particular circumstance. You may also need to test variations of the same index to see which variation works best for you. [6.5, 7.0, 2000, 2005] Updated 9-4-2006
Don’t automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by the queries run against the table. [6.5, 7.0, 2000, 2005] Updated 9-4-2006
Static tables (those tables that change very little, or not at all) can be more heavily indexed that dynamic tables (those that are subject to many INSERTs, UPDATES, or DELETES) without negative effect. This doesn’t mean you should index every column. Only those columns that need an index should have them. But at least you don’t have to worry about the overhead of indexes when they are added to static tables, as you must keep in mind when adding indexes to dynamic tables.
In addition, for these tables, create the indexes with a FILLFACTOR and a PAD_INDEX of 100 to ensure there is no wasted space. This reduces disk I/O, helping to boost overall performance. [6.5, 7.0, 2000, 2005] Updated 9-4-2006
Point queries, queries than return a single row, are just as fast using a clustered index as a non-clustered index. If you will be creating an index to speed the retrieval of a single record, you may want to consider making it a non-clustered index, and saving the clustering index (you can only have one) for queries that return a range of data. [6.5, 7.0, 2000, 2005] Updated 9-4-2006
To help identify which tables in your database may need additional indexes, use the SQL Server Profiler Create Trace Wizard to run the “Identify Scans of Large Tables” trace. This trace will tell which tables are being scanned by queries instead of using an index to seek the data. This should provide you data you can use to help you identify which tables may need additional or better indexes. [7.0] Updated 9-4-2006
Don’t over index your OLTP tables, as every index you add increases the time it takes to perform INSERTS, UPDATES, and DELETES. There is a fine line drawn between having the ideal number of indexes (for SELECTs) and the ideal number to minimize the overhead that occurs with indexes during data modifications. [6.5, 7.0, 2000] Updated 9-4-2006
Don’t accidentally add the same index twice on a table. This is easier to do than you think. For example, you add a unique or primary key to a column, which of course creates an index to enforce the specified constraint. But without thinking about it when evaluating the need for indexes on a table, you decide to add a new index, and this new index happens to be on the same column as the unique or primary key. As long as you give indexes different names, SQL Server will allow you to create the same index over and over, even when it is a bad idea. [7.0, 2005, 2006] Updated 11-6-2006
Drop indexes that are not used by the Query Optimizer. Unused indexes slow data modifications, causes unnecessary I/O reads when reading pages, and wastes space in your database, and increasing the amount of time it takes to backup and restore databases. [7.0, 2000, 2005] Updated 11-6-2006