General Tips on Optimizing SQL Server Indexes – Page 2

Generally, you probably won’t want to add an index to a table under these conditions:

If the index is not used by the query optimizer. Use Query Analyzer’s “Show Execution Plan” option to see if your queries against a particular table use an index or not.
If the table is small, most likely indexes will not be used.
If the column values exhibit low selectivity, often less than 90%-95% for non-clustered indexes.
If the column(s) to be indexed are very wide.
If the column(s) are defined as TEXT, NTEXT, or IMAGE data types.
If the table is rarely queried.

Creating an index under any of these conditions will most likely result in an index that is rarely, or not used at all. [6.5, 7.0, 2000, 2005] Updated 11-6-2006

While high index selectivity is generally an important factor that the Query Optimizer uses to determine whether or not to use an index, there is one special case where indexes with low selectivity can be useful speeding up SQL Server. This is the case for indexes on foreign keys.

Whether an index on a foreign key has either high or low selectivity, an index on a foreign key can be used by the Query Optimizer to perform a merge join on the tables in question. A merge join occurs when a row from each table is taken and then they are compared to see if they match the specified join criteria. If the tables being joined have appropriate indexes (no matter the selectivity), a merge join can be performed, which is generally much faster than a join to a table with a foreign key that does not have an index. [7.0, 2000, 2005] Updated 11-6-2006

*****

On data warehousing databases, which are essentially read-only, having as many indexes as necessary for covering virtually any query is not normally a problem. [6.5, 7.0, 2000, 2005] Updated 8-23-2004

*****

To provide the up-to-date statistics the query optimizer needs to make smart query optimization decisions, you will generally want to leave the “Auto Update Statistics” database option on. This helps to ensure that the optimizer statistics are valid, helping to ensure that queries are properly optimized when they are run.

But this option is not a panacea. When a SQL Server database is under very heavy load, sometimes the auto update statistics feature can update the statistics at inappropriate times, such as the busiest time of the day. This can cause blocking, reducing performance.

If you find that the auto update statistics feature is running at inappropriate times, you may want to turn it off, and then manually update the statistics (using UPDATE STATISTICS or sp_updatestats) when the database is under a less heavy load.

But again, consider what will happen if you do turn off the auto update statistics feature? While turning this feature off may reduce some stress on your server by not running at inappropriate times of the day, it could also cause some of your queries not to be properly optimized, which could also put extra stress on your server during busy times.

Like many optimization issues, you will probably need to experiment to see if turning this option on or off is more effective for your environment. But as a rule of thumb, if your server is not maxed out, then leaving this option on is probably the best decision. [7.0, 2000] Updated 8-23-2004

*****

Keep the “width” of your indexes as narrow as possible. This reduces the size of the index and reduces the number of disk I/O reads required to read the index, boosting performance. [6.5, 7.0, 2000, 2005] Updated 8-23-2004

*****

If possible, try to create indexes on columns that have integer values instead of characters. Integer values use less overhead than character values. [6.5, 7.0, 2000] Updated 8-23-2004

*****

If you have two or more tables that are frequently joined together, then the columns used for the joins should have an appropriate index. If the columns used for the joins are not naturally compact, then considering adding surrogate keys to the tables that are compact in order to reduce the size of the keys, thus decreasing I/O during the join process, which increases overall performance. [6.5, 7.0, 2000, 2005] Updated 8-23-2004

*****

When creating indexes, try to make them unique indexes if at all possible. SQL Server can often search through a unique index faster than a non-unique index because in a unique index, each row is unique, and once the needed record is found, SQL Server doesn’t have to look any further. [6.5, 7.0, 2000, 2005] Updated 8-23-2004

*****

If a particular query against a table is run infrequently, and the addition of an index greatly speeds the performance of the query, but the performance of INSERTS, UPDATES, and DELETES is negatively affected by the addition of the index, consider creating the index for the table for the duration of when the query is run, then dropping the index. An example of this is when monthly reports are run at the end of the month on an OLTP application. [6.5, 7.0, 2000, 2005] Updated 8-23-2004

*****

If you like to get under the cover of SQL Server to learn more about indexing, take a look at the sysindex system table that is found in every database. Here, you can find a wealth of information on the indexes and tables in your database. To view the data in this table, run this query from the database you are interested in:

SELECT *
FROM sysindexes

Here are some of the more interesting fields found in this table:

dpages: If the indid value is 0 or 1, then dpages is the count of the data pages used for the index. If the indid is 255, then dpages equals zero. In all other cases, dpages is the count of the non-clustered index pages used in the index.
id: Refers to the id of the table this index belongs to.
indid: This column indicates the type of index. For example, 1 is for a clustered table, a value greater than 1 is for a non-clustered index, and a 255 indicates that the table has text or image data.
OrigFillFactor: This is the original fillfactor used when the index was first created, but it is not maintained over time.
statversion: Tracks the number of times that statistics have been updated.
status: 2 = unique index, 16 = clustered index, 64 = index allows duplicate rows, 2048 = the index is used to enforce the Primary Key constraint, 4096 = the index is used to enforce the Unique constraint. These values are additive, and the value you see in this column may be a sum of two or more of these options.
used: If the indid value is 0 or 1, then used is the number of total pages used for all index and table data. If indid is 255, used is the number of pages for text or image data. In all other cases, used is the number of pages in the index.

[7.0, 2000] Updated 8-23-2004

*****

Avoid using FLOAT or REAL data types as primary keys, as they add unnecessary overhead that can hurt performance. [6.5, 7.0, 2000, 2005] Updated 8-23-2004

*****

If you want to boost the performance of a query that includes an AND operator in the WHERE clause, consider the following:

Of the search criterions in the WHERE clause, at least one of them should be based on a highly selective column that has an index.
If at least one of the search criterions in the WHERE clause is not highly selective, consider adding indexes to all of the columns referenced in the WHERE clause.
If none of the columns in the WHERE clause are selective enough to use an index on their own, consider creating a covering index for this query.

[7.0, 2000, 2005] Updated 8-23-2004

*****

The Query Optimizer will always perform a table scan or a clustered index scan on a table if the WHERE clause in the query contains an OR operator and if any of the referenced columns in the OR clause are not indexed (or does not have a useful index). Because of this, if you use many queries with OR clauses, you will want to ensure that each referenced column in the WHERE clause has an index. [7.0, 2000, 2005] Updated 8-23-2004

*****

A query with one or more OR clauses can sometimes be rewritten as a series of queries that are combined with a UNION statement in order to boost the performance of the query. For example, let’s take a look at the following query:

SELECT employeeID, firstname, lastname
FROM names
WHERE dept = ‘prod’ or city = ‘Orlando’ or division = ‘food’

This query has three separate conditions in the WHERE clause. In order for this query to use an index, then there must be an index on all three columns found in the WHERE clause.

This same query can be written using UNION instead of OR, like this example:

SELECT employeeID, firstname, lastname FROM names WHERE dept = ‘prod’
UNION ALL
SELECT employeeID, firstname, lastname FROM names WHERE city = ‘Orlando’
UNION ALL
SELECT employeeID, firstname, lastname FROM names WHERE division = ‘food’

Each of these queries will produce the same results. If there is only an index on dept, but not the other columns in the WHERE clause, then the first version will not use any index and a table scan must be performed. But in the second version of the query will use the index for part of the query, but not for all of the query.

Admittedly, this is a very simple example, but even so, it does demonstrate how rewriting a query can affect whether or not an index is used or not. If this query was much more complex, then the approach of using UNION might be most more efficient, as it allows you to tune each part of the index individually, something that cannot be done if you use only ORs in your query.

If you have a query that uses ORs and it not making the best use of indexes, consider rewriting it as a UNION, and then testing performance. Only through testing can you be sure that one version of your query will be faster than another. [7.0, 2000, 2005] Updated 10-4-2004

*****

The Query Optimizer converts the Transact-SQL IN clause to the OR operator when parsing your code. Because of this, keep in mind that if the referenced column in your query doesn’t include an index, then the Query Optimizer will perform a table scan or clustered index scan on the table. [6.5, 7.0, 2000, 2005] Updated 10-4-2004

*****

If you use the SOUNDEX function against a table column in a WHERE clause, the Query Optimizer will ignore any available indexes and perform a table scan. If your table is large, this can present a major performance problem. If you need to perform SOUNDEX type searches, one way around this problem is to pre-calculate the SOUNDEX code for the column you are searching and then place this value in a column of its own, and then place an index on this column in order to speed searches. [6.5, 7.0, 2000] Updated 10-4-2004

*****

If you need to create indexes on large tables in SQL Server 2000, you may be able to speed up their creation by using the SORT_IN_TEMPDB option available with the CREATE INDEX command. This option tells SQL Server to use the tempdb database, instead of the current database, to sort data while creating indexes.

Assuming your tempdb database is isolated on its own separate disk or disk array, then the process of creating the index may be sped up.

The only slight downside to using this option is that it takes up slightly more disk space than if you didn’t use it, but this shouldn’t be much of an issue in most cases. If your tempdb database is not on its own disk or disk array, then don’t use this option, as it can actually slow performance. [2000] Updated 10-4-2004

*****

SQL Server 2000 Enterprise Edition (not the standard edition) offers the ability to create indexes in parallel, greatly speeding index creation. Assuming your server has multiple CPUs, SQL Server 2000 uses near-linear scaling to boost index creation speed. For example, using two CPUs instead of one CPU almost halves the speed it takes to create indexes. [2000] Updated 10-4-2004

Continues…

Leave a comment Cancel reply