Guide to SQL Server Table Indexes – Part 3
In the first part of this article series, you learned about the basics of SQL Server database table’s indexes, the difference between the clustered and non-clustered indexes, and how the leaf nodes, non-leaf nodes, and heaps associated with data storage. In the second part, you learned about the different types of indexes available in SQL Server, and some of these indexes advantages and dis-advantages.
In this part, you will learn about remaining index types available in SQL Server, and what are these indexes advantages and dis-advantages.
Partitioned indexes are indexes created on partitioned tables. Partitioned indexes are partitioned on the same horizontal filter and on the same ranges as the table on which they are based. You specify the table partition scheme, how the table is partitioned, when creating partitioned indexes. You can also create partition index on existing non-partition table, but for this, you first have to convert the existing non-partitioned table into the partition table. To do this you first need to add appropriate partition filegroups, then create partition function and partition scheme inside database. Once done, rebuild the desired table indexes/indexes on this partition.
The general syntax for creating the partitioned index is as follow:
CREATE [UNIQUE] [CLUSTERED|NONCLUSTERED] INDEX index_name ON [[database.]schema.]table|view (column [ASC | DESC] , …) INCLUDE (column_list) [WITH [index_options]][ON partition_scheme]
Partitioned indexes help optimize queries that contain only data from a single partition. Partitioning an index can also optimize index management because you can rebuild only a single index of a partition.
When you create partitioned index, keep in mind the following requirements:
- Unique Index (clustered or non-clustered) – You must add the partitioning column to the key.
- Clustered non-unique index – SQL Server adds the partitioning column to the key.
- Non-clustered non-unique index – SQL Server adds the partitioning column as an included column.
- You can only base your partition on a single column.
- You can’t use the following data types as the partitioning column: timestamp, ntext, text, image, xml, varchar(max), nvarchar(max), varbinary(max), alias, and CLR user-defined types.
For detailed discussion of partitioned tables and indexes, see Partitioned Tables and Indexes.
Beginning with SQL Server 2008, Microsoft introduced the new type of non-clustered known as filtered index. Filtered index is an optimized non-clustered index that only contains the specified subset of data, specified by the filter predicate. Filtered indexes are especially useful to cover those queries that frequently needs access to well-defined subset of data. Having a well-designed filtered index can improve query performance, reducing the overall index maintenance costs, and index storage costs compared with full-table indexes.
For example, examine the execution plan of the following query with following covering index (IXNC_SalesOrderDetail_ModifiedDate) that is frequently executed, returning all orders from Sales.SalesOrderDetail that are placed on or after January, 1st 2008:
SELECT [SalesOrderID] ,[SalesOrderDetailID] ,[OrderQty] ,[ProductID] ,[SpecialOfferID] ,[UnitPrice] ,[UnitPriceDiscount] ,[LineTotal] ,[ModifiedDate] FROM [AdventureWorks2012].[Sales].[SalesOrderDetail]
CREATE NONCLUSTERED INDEX IXNC_SalesOrderDetail_ModifiedDate ON [Sales].[SalesOrderDetail] ([ModifiedDate]) INCLUDE ([SalesOrderID] ,[SalesOrderDetailID] ,[OrderQty] ,[ProductID] ,[SpecialOfferID] ,[UnitPrice] ,[UnitPriceDiscount] ,[LineTotal])
As you can see from above, it’s doing non-clustered index seek on index IXNC_SalesOrderDetail_ModifiedDate to return the data, however, we can still reduce the query execution time and improve its performance significantly by converting the above covering index to following filtered index:
CREATE NONCLUSTERED INDEX IXNC_SalesOrderDetail_ModifiedDate ON [Sales].[SalesOrderDetail] ([ModifiedDate]) INCLUDE ([SalesOrderID] ,[SalesOrderDetailID] ,[OrderQty] ,[ProductID] ,[SpecialOfferID] ,[UnitPrice] ,[UnitPriceDiscount] ,[LineTotal]) WHERE [ModifiedDate] >= '2008-01-01 00:00:00.000'
This is because our search is on ModifiedDate that are on or after January, 1st 2008. Having the above filtered index will significantly improve its performance (see the following execution plan after converting index to filtered index).
Although, it’s doing non-clustered index scan on this filtered index, but the overall execution time and I/O costs to return the required data is improved with this index.
Column store indexes
A column store index also known as xVelocity indexes is a new type of non-clustered index introduced with the release of SQL Server 2012. Column store indexes are based on VertiPaq in-memory data compression technology, and are considered one of the most significant performance and scalability enhancements in SQL Server 2012. This is due to the fact that unlike traditional tables (heaps) and indexes (B-trees), where data is stored and grouped in a row-based fashion, column store indexes stores data column-wise (This is each column in a separate set of disk pages, rather than storing multiple rows per page, which is the traditional storage format), and then join the columns to complete the index. Moreover, VertiPaq in-memory data compression allows a large amount of data to be compressed in-memory; this result in the query requiring less I/O because the amount of data transferred from disk to memory is significantly reduced.
The column store indexes are useful for data warehousing queries as they speeds up the processing time particularly in situations where very large quantities of data have to be aggregated and accessed quickly.
Column store indexes has some restrictions, such as, column store indexes cannot have more than 1024 column, and cannot include columns with following data types: binary, varbinary, ntext, text, image, varchar(max), nvarchar(max), uniqueidentifier, FILESTREAM, xml, rowversion (and timestamp), sql_variant, decimal (and numeric) with precision greater than 18 digits, datetimeoffset with scale greater than 2 and CLR types (hierarchyid and spatial types) and sparse columns. Moreover, they cannot be set up as a clustered or unique index, and cannot be created over a View or Indexed View.
For more information, see extensive set of documentation at the Microsoft "Columnstore Indexes” web page, and read my article (xVelocity Columnstore Indexes in SQL Server 2012) on SearchSQLServer.com.
A full-text search is a word search based on character string data. The Microsoft Full-Text Engine for SQL Server creates and maintains a full-text catalog automatically when you enable a table for full-text search.
For more information about full-text indexes, see Populate Full-Text Indexes.
An XML index is a persisted representation of the data contained in an XML data type column. XML indexes have different procedures for creation and management than standard indexes, and are structured differently than standard indexes. There are two basic XML index types: primary and secondary. You must create the primary index first and can then create one or more secondary indexes. When creating XML indexes, the base table must have a primary key constraint before you can create XML indexes on the table. If the base table is a partitioned table, XML indexes use the same partitioning function and partitioning schema. Moreover, you can create one primary index and one or more secondary indexes for each XML column in the base table. If you use data type methods, you should create at least a primary index. All data type methods use the primary index for optimization, if present.
Primary XML index
The primary XML index is a balanced tree (B-tree) structure, like that of other SQL Server indexes. It includes tags, values, and node paths discovered in the document when you create the index. When creating the primary XML index, you don’t have to specify the any key information. Prior to creating XML indexes on the table, review the schema of the base table and ensure that the base table must have clustered index on the primary key. SQL Server extracts the XML instance’s as index keys, and it also uses the document order and the path from the root (topmost instance element) to each node as an index key. It is important that SQL Server uses the path information in the index, because many XQuery expressions include the navigational path to an item as part of the expression, including the path in the index makes it faster and easier to locate the route nodes.
You create secondary XML indexes to enhance XML queries search performance. To create one or more secondary indexes, you must first create the primary XML index. The types of secondary indexes you create depend on the types of queries run against the column. SQL Server supports three types of secondary indexes: PATH secondary XML index, PROPERTY secondary XML index, and VALUE secondary XML index. You can create any, all, or none of these secondary index types on an XML instance.
For more information about XML indexes, see XML Indexes (SQL Server).
Continue to Part-4:
In part 4, you will learn about index design and the use of key columns, included columns, and limits on index row size.