SQL Server Performance Tuning – Index Tuning
In earlier installments of this series we looked at T-SQL Performance optimizations along with different T-SQL practices, we can now turn our attention to the second part of this series which is index Tuning
In looking into Views we will break the article into two sections :
- Basic Index Tuning using DTA (Database Tuning Advisor)
to indentify indexes that need to be reviewed and validated against the current workload of your T-SQL queries.
Advanced Index Tuning using a Query Execution plan
to find missing indexes and determine the expected percent of enhancement for expensive queries.
Basic Index Tuning using DTA
This method offers the benefit of simplicity as well as detailed reporting and a broad scope of tuning options which can include indexes, filtered indexes, indexed views and schema partitioning that it can’t be covered easily by the advanced method.
There are two ways to use the DTA :
1- DTA without workload :
- Copy all of your T-SQL Queries into one Session
- Right click Select Analyze Query in Database Tuning advisor
- Select the database first used in your workload (normally identified within the USE statement)
- Select any additional databases and tables used within your T-SQL Queries
- In Tuning Options to set up your options according to your requirements, but note that the below set of options can work efficiently for a wide variety of different workloads
I. Mark out “Limit Tuning Time” to open time specified for tuning window
II. PDS to use : select (Indexes + include Filtered indexes ) if you are using SQL Server 2008 and upwards (if you are using SQL Server 2005 you will not be shown any Filtered indexes)
III. Partitioning strategy : select No partitioning
IV. PDS to keep in database: select “Don’t keep any existing PDS”
- Start Analysis and then you will be shown the DTA recommendations and report.
2- DTA with workload :
This way of using DTA can help in capturing a live workload when you are unsure of the exact T-SQL statements and SP parameters being used, but note that it can impact negatively on the database performance especially if you do not narrow the scope of your profiler to certain filters such as users, source, CPU cost etc
- Open SQL profiler and select the profiler named “Tuning”
- Select the appropriate filter ( DB Name , user, Source , CPU cost …)
- Specify an end time for your profiler
- Save the workload output to either a file or a table
- Run the profiler
- After completion, open DTA and load the workload from either the file or table you have saved your workload
- Repeat the steps enumerated above per “DTA without Workload” starting from Step 3
Advanced Index Tuning using a Query Execution plan
Index Tuning using a Query Execution plan will give us a more granular and detailed insight into our indexes.
In order to use this method efficiently, we need more insight into the query execution plan, namely the below elements:
- Index Scans
- Table Scans
- TV Scans (Table Valued Scan)
- Indexes Scan or Seek with high I/O or CPU cost (particularly more those larger than 1%)
- Key Lookup ( Bookmark lookup) and RID (Record Identifier lookup)
Index scans can result in major performance degradation and high CPU utilization as it reproduces CXPacket waits due to insufficient indexing that cause a query to scan the data entity of tables instead of seeking the indexes. Therefore we should create the appropriate non-clustered indexes to replace all index scans by index seeks using covered compound indexes.
Covered compound indexes are normally the most efficient way for indexing since they comprehensively cover all table columns that exists within your T-SQL queries and thus can replace index scans with index seeks.
If your databawse operations are inclined more to OLTP transactions (Insert/Update/Delete) rather than Select queries then large non-clustered indexes sizes can impact negatively OLTP performance as it will increase the size of clustered index and thus the IO cost.
So how can we design covered compound indexes?
Index design comprises two major parts “Key Columns” and “Include Column” and you distribute columns between them according to their usage within the T-SQL query as follows:
- Join columns and Where conditions columns are the main ones that should be allocated in Key Columns
- Order By and Group by columns could be included in Key Columns if they are crucial for sorting, otherwise put them in Include Columns
- All relevant Select columns should be included in the Include columns part to make sure that the covered compound indexing concept is satisfied
Table scans take place in cases of a missing primary key, so you have just to create primary keys on the appropriate columns of relevant tables.
Note that an issue can arise when you add an identity column and assign it as primary key since it may contradict with insert queries haven’t a Column list definition as below:
Insert into table select * from table1
Therefore, you first have to check first all insert queries referencing targeted tables and make sure they specific columns as below:
Insert into table (Column1, Column2,.....) select Column1, Column2 from table1
TV Scans (Table Valued Scan)
This could be resolved by reference to the T-SQL enhancements explained per in the TVF and Scalar Functions article
Indexes Scans or Seeks with high I/O or CPU cost
It isn’t sufficient to replace index scans with index seeks without reference to the IO and CPU costs of indexes which can negatively impact performance. Since SQL Server 2008, we have several techniques to evaluate the value of adding indexes:
Filtered indexing to narrow down seek scope of the index to a small volume of data resulting in the least I/O cost
CREATE NONCLUSTERED INDEX [Salaries_IX2] ON [dbo].[Salaries] ( [Salary] asc ) INCLUDE ( name ,[EmployID]) where (Salary > 5000 [Salary] < 1000)
But there are some restrictive limitations for using filtered indexes such as:
I. Filtration criteria should have static values and in addition Filtered indexes which have simple operators such as ‘like’ , ‘=’,’>’ ,'<‘ are more likely to be used within the Query execution plan. Filtered Indexes with complex operators such as “not like” ..etc are less granular for Query Analyzer and thus they may not be used within the Query execution plan
II. Filtration criteria should have the same syntax as the relevant query itself .
III. Filtration criteria shouldn’t use any derived function such as Avg([salary]) , Min([salary])…etc
Data Compression and more especially page compression where a significant reduction of I/O cost can be achieved (row compressions can be better sometimes but to estimate which is optimal you can run Exec USP_Compression_Savings_sp results (Attached with Article) to show clear estimations about compression savings for each).
CREATE NONCLUSTERED INDEX [Salaries_IX2] ON [dbo].[Salaries] ( [Salary] asc ) INCLUDE ( name ,[EmployID]) with (data_compression=page)
RID Lookup and Bookmark lookup (Key lookup):
Bookmark (key) lookups are caused by missing columns from an index design which are referenced within your T-SQL Query. This results in the query execution plan isng Bookmark (key) lookups to access them from tables directly which represents an additional cost that should be addressed by adding missing columns to index design
RID (Record identifier ) lookups are caused by missing clustered indexes so the qery execution plan is trying to use RID instead of the missing clustered index and thus should create a clustered index as explained above in “Table Scan”
In addition to all of the above, indexes should be maintained periodically to keep them efficient all the time thus query execution plan can select them so here below is a list of the most important maintenance plans:
Index rebuilds to overcome index fragmentation particularly more if fragmentation exceeds 30. You can do easily do this for a group of databases by applying the attached stored procedure Rebuild Indexes SP then you can just exec it with specifying the start and the end of database IDs you need to perform index rebuilds on as below:
exec Index_Rebuild_All @start_DB_ID ,@END_DB_ID
This can be done on a daily or weekly basis according to 3 parameters:
• Data entity size of tables
• Available peak off times
• Index fragmentations percent
Update statistics particularly more for index statistics (ideally on a daily basis) and you can simply use the below SP to update all statistics on a database:
use [workshops] Exec sp_updatestats