Identify Missing Indexes Using SQL Server DMVs
Performance tuning is one of the major that database administrators need to perform when supporting very large databases. Analysis of present index usage is an initial stage of performance tuning. In the previous article titled Are Your Indexes Being Used Effectively?, how to check whether all the indexes on a table are used effectively or not was explained. The next task is how to analyse and fix index fragmentations, covered in the previous article titled Analyze and Fix Index Fragmentation in SQL Server 2008.
It has been always challenging for database administrators to identity indexes that are missing on a table. In SQL Server 2005, Microsoft introduced Dynamic Management Views (DMVs). The role of DMVs is to return SQL Server state information; which can be used by database administrators and database developers to monitor the health of an SQL Server Instance and identify potential performance issues. DMVs reflect all the activities on the instance of SQL Server since the last restart. Unfortunately in SQL Server editions prior to SQL Server 2005 there is no easy way to identify missing indexes on a table. In SQL Server 2000 the only way to identify if an index needs to be created is to capture a workload in SQL Profiler and then run it against the Index Tuning Wizard. However, in order to access the information stored in DMVs you need to have VIEW SERVER STATE permissions. Only members of the sysadmin fixed server role can grant VIEW SERVER STATE permissions to other users. In this article example demonstrating how to identify missing indexes on a table is provided.
The Dynamic Management Views (DMV) which can be used to identify missing indexes on a table are:
Below is an example explaining how to analyse missing indexes on a table using Dynamic Management Views. This example uses the Sales.Store table which is available in AdventureWorks database.
Identify Existing Indexes on a Table
Execute the below TSQL to identify existing indexes on Sales.Store table.
Sample User Query Executed against Sales.Store Table of AdventureWorks
Next step will be to execute the below mentioned TSQL code against AdventureWorks database.
SELECT Name,CustomerID,ModifiedDate FROM Sales.Store WHERE (Name=’Sharp Bikes’
and CustomerID <> ” AND ModifiedDate > ’2004-10-01′)
Once the above query is run, the relevant information related to missing index will be gathered by dynamic management views. Now to examine the information that is gathered by missing index DMV’s one by one.
Analysis Information Captured by sys.dm_db_missing_index_details DMV
Analyse the information gathered by sys.dm_db_missing_index_details DMV by executing the TSQL code shown below.
SELECT * FROM sys.dm_db_missing_index_details
The important columns to look are equality_columns, inequality_columns, included_columns and statement. Below is a breif explanation of the valuable information which these columns contain.
- equality_columns:- The equality_columns = [Name] means within the sample user query [Name] column was used in there WHERE clause and with an equal operator, SQL Server query optimizer is informing us that [Name] column is a good candidate for creating a new index.
- inequality_columns:- The inequality_columns = [CustomerID] and [ModifiedDate] means within the sample user query [CustomerID] and [ModifiedDate] columns were used in the WHERE clause, however they used operators other than the equal operator. In the sample query [CustomerID] was using “<>” NOT equal operator and [ModifiedDate] is using “>” greater than operator.
- included_columns:- The included_columns = “NULL” means when creating a new index on [Name] column, there is no need to add any other column of the Sales.Store table as an included column. If there are values displayed then those columns can be added as included columns when creating an index; this will cover the query thereby improving performance.
- statement:- This column has the value [AdventureWorks].[Sales].[Store] which informs DBA against which table the query was executed.
Analysis Information Captured by sys.dm_db_missing_index_group_stats DMV
Below is an analysis of the information gathered by sys.dm_db_missing_index_group_stats DMV after executing the TSQL query:
SELECT * FROM sys.dm_db_missing_index_group_stats
The important columns to focus on are unique_compiles, user_seeks, last_user_seek, avg_total_user_cost and avg_user_impact. A brief explanation of the information contained in each column follows:
- unique_compiles:- The value 1 indicates that the query has been compiled once after the SQL Server was restarted, meaning there was no recompilation by the sample user query even though the query was executed 21 times (value obtained from user_seeks).
- user_seeks:- The value 21 represents that the user query was executed 21 times, once the SQL Server was started.
- last_user_seek:- This column contains data and time data from when the user query was last executed.
- avg_total_user_cost:- This column represent the average total user cost each time when the user query was executed.
- avg_user_impact:- This column represents the value in percentages. It informs us the amount of improvement which you can get if the index is created.
Analysis Information Captured by sys.dm_db_missing_index_groups DMV
Below is an screenshot of the information gathered by sys.dm_db_missing_index_groups DMV following execution of the TSQL query:
SELECT * FROM sys.dm_db_missing_index_groups
Analysis Information Captured by sys.dm_db_missing_index_columns DMV
Below is an analysis of the information gathered by sys.dm_db_missing_index_columns DMV following execution the TSQL query:
SELECT * FROM sys.dm_db_missing_index_columns(1)
The important columns to pay attention to are column_name and column_usage. A brief explanation of the information each column contains follows:
- column_name:- This column represent the name of the column within the user table, in our example the table used is Sales.Store of AdventureWorks database and the columns used are Name, CstomerID and ModifiedDate.
- column_usage:- This column represents how the user column is used within the user query. It displays which operator was used against each of the columns.
Query to Identifying Missing Indexes
The below query can be used to identify all the missing indexes on a user table. This query uses all the missing index DMV’s and provides a consolidated result. However execute the below query only when the server has cached most of the query plans. This is very important to get a clear result of the missing indexes. SQL Server can store upto a maximum of 500 missing index information once the SQL Server is started within the DMV’s. The missing index feature turned on by default in an SQL Server instance.
statement AS [database.scheme.table],
column_id , column_name, column_usage,
FROM sys.dm_db_missing_index_details AS mid
CROSS APPLY sys.dm_db_missing_index_columns (mid.index_handle)
INNER JOIN sys.dm_db_missing_index_groups AS mig
ON mig.index_handle = mid.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs
ORDER BY mig.index_group_handle, mig.index_handle, column_id
The next step is to create a nonclustered index for name column of Sales.Store table of AdventureWorks database as suggested by missing index DMV’s.
Create Non Clustered Index for Name Column
Execute the below TSQL code to create a non clustered index on Sales.Store table.
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N’[Sales].[Store]‘) AND name = N’IX_Store_Name’)
DROP INDEX [IX_Store_Name] ON [Sales].[Store] WITH ( ONLINE = OFF )
CREATE NONCLUSTERED INDEX [IX_Store_Name] ON [Sales].[Store]
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
Once the nonclustered index is created, the information that was available in the DMV has disappeared, shown in the snippet below.
Disadvantage of having too many indexes on a table
- Insert, Update and Delete operations will become very slow if there are many indexes created on a table. This occurs when the Insert, Update or a Delete operation against a table results in all the indexes being updated, reducing query performance.
- Indexes are stored on disk and the amount of space required by the index depends on the size of database table, and the number and type of columns used within the index definition.
- A greater number of indexes result in more disk space being required to store them.
Limitations of Missing Index Feature
- A DMV can store information from a maximum of 500 missing indexes.
- Unable to provide recommendations for clustered, indexed views and partitioning.
- Once the SQL Server is restarted all the information related to missing indexes is lost. To keep the information for later use, the DBA needs to backup all the data available within all the missing index DMV prior to the restart of SQL Server.
Database Administrators need to analyse the impact of an index created for Insert, Update and Delete operations before accepting recommendations given by the missing index DMVs.