Guide to SQL Server Table Indexes – Part 1

Overview

One of the most important tasks for every database administrator (DBA) is to ensure that query times are consistent with service-level agreement (SLAs) or within user expectations. One of the most effective techniques to achieve this objective is to create indexes on tables for your queries.  SQL Server uses indexes to sort and organize table data. It creates indexes based on ordering values from one or more selected columns. SQL Server automatically creates indexes when you define a primary key or unique constraint. You can use indexes to manage and track rows in a table or an indexed view.

Indexes improve the performance on most data scans by reducing the overall time query takes to run and the amount of work and resources it consumes. The amount of performance improvement depends on the size of the tables involved, the index design, and the type of query. You can see the role of indexes by observing what happens during queries and data manipulation.

In this part, you will learn about the basics of SQL Server database table’s indexes, the difference between the clustered and non-clustered indexes, and how the leaf nodes, non-leaf nodes, and heaps associated with data storage.

How indexes work?

A table that has no clustered index is stored in a set of data pages called a heap. The data is usually stored initially in the order in which the rows are inserted into the table, but SQL Server Database Engine moves the data around the heap to store the rows efficiently, so the order of rows cannot be predicted and are in no particular order, similarly data pages are not sequenced in a particular order. The only way to guarantee the order of the rows from a heap is to use ORDER BY clause of Transact-SQL.

Access without an index

When you access data, SQL Server first determines if there is an index available. If not, the database engine retrieves data by scanning the entire table. The database engine begins scanning at the physical beginning of the table, and scans through the full table page-by-page and row-by-row to look for qualifying data. Then extracts and returns rows meeting the selection criteria.

Access with an index

The process is improved when indexes are present. If there is an appropriate index available, SQL Server uses it to locate the data. An index improves the search process by sorting data on key columns. The database engine begins scanning at the first page of the index, and only scans those pages that potentially contain qualifying data, based on the index structure and key columns. Finally retrieves the data rows or pointers that contain the locations of the data rows to allow direct row retrieval.

The information SQL Server returns from the table scan, rows or pointers, depends on whether the database engine scans a clustered or non-clustered index:

Clustered index: SQL Server physically sorts table rows in index order based on key column values. The scan returns the actual data rows.

Non-clustered index: The index doesn’t affect physical sort order. The scan returns pointers to the data rows.

Indexes Structure

An index is made up of nonleaf nodes and leaf nodes, and is structured as B-Trees, as shown in figure below:

Description: C:\Users\PMNSER~1\AppData\Local\Temp\SNAGHTML4e1e44f.PNG

• Nonleaf nodes are hierarchical nodes that make up the index sort tree.

• Nonleaf nodes are also referred to as intermediary nodes.

• Nonleaf nodes point to other nonleaf nodes one step down in the hierarchy until reaching the leaf nodes.

• Leaf nodes are at the bottom of the hierarchy.

• In a clustered index, the leaf nodes are data pages, containing table rows.

• In a non-clustered index the leaf nodes contain pointers to the data rows.

The non-clustered indexes are based on clustered indexes when present. The leaf node of a non-clustered index points to leaf node locations in clustered indexes. When you rebuild a clustered index, the leaf node structure of non-clustered indexes also changes. Similarly any changes or modification to the clustered index changes the non-clustered index.

Indexes are most effective when you need a small percentage of the records stored in the table and there is a high level of selectivity between the elements (unique or nearly unique values). The main reasons for the creation of indexes are:

• Improve performance when sorting or grouping data.

• Improve performance on queries that join tables.

• Impose uniqueness on a column or set of columns.

You won’t typically create indexes on columns that have few unique values or on columns that are seldom referenced in queries.

Size limits

A table can have:

• Zero or one clustered index.

• Up to 249 non-clustered indexes.

An index is based on one or more key columns. When you define an index:

• You can specify no more than 16 key columns.

• The index row length (the total of all key columns) cannot exceed 900 bytes.

Variable-length data types present a possible concern. If the current data in a variable length data type does not exceed the 900-byte limit, you can create an index with variable-length data types that have a potential of exceeding this limit. For example, you specify the following columns:

Column

Maximum

Current maximum in use

nchar(50)

50

50

nchar(100)

100

100

nvarchar(400)

400

400

nvarchar(500)

400

400

You can create an index based on these key columns, but SQL Server returns a warning that errors can occur in the future. Consider a situation in which, at some future time, you attempt to add data that uses the maximum length of the nvarchar data types. When you do, the operation will fail. This is because SQL Server cannot create a new row in the index because the total physical length, the space required to store the data, exceeds the 900-byte limit.

Drawbacks

There are some drawbacks to imposing indexes on a table:

• Creating indexes takes time and disk space.

• Data update queries (INSERT, UPDATE, or DELETE) must also update table indexes.

• Small tables generally do not receive measurable benefits from being indexed.

• Key columns cannot include bit, text, ntext, or image type data columns.

You use the INSERT, UPDATE, and DELETE statements to modify table data. When you add data (INSERT), change data values (UPDATE), or remove rows from a table (DELETE), SQL Server updates any indexes on that table. The more indexes SQL Server has to update, the more overhead that is incurred by the process.

Continue to Part-2:

In the next part, you will learn about the different types of indexes available in SQL Server, and what are these indexes advantages and dis-advantages.

]]>

Leave a comment

Your email address will not be published.