Guide to SQL Server Table Indexes – Part 1

Overview

One
of the most important tasks for every database administrator (DBA) is to ensure
that query times are consistent with service-level agreement (SLAs) or within
user expectations. One of the most effective techniques to achieve this
objective is to create indexes on tables for your queries.  SQL Server uses
indexes to sort and organize table data. It creates indexes based on ordering
values from one or more selected columns. SQL Server automatically creates
indexes when you define a primary key or unique constraint. You can use indexes
to manage and track rows in a table or an indexed view.

Indexes
improve the performance on most data scans by reducing the overall time query
takes to run and the amount of work and resources it consumes. The amount of performance
improvement depends on the size of the tables involved, the index design, and
the type of query. You can see the role of indexes by observing what happens
during queries and data manipulation.

In
this part, you will learn about the basics of SQL Server database table’s
indexes, the difference between the clustered and non-clustered indexes, and
how the leaf nodes, non-leaf nodes, and heaps associated with data storage.

How
indexes work?

A
table that has no clustered index is stored in a set of data pages called a heap.
The data is usually stored initially in the order in which the rows are
inserted into the table, but SQL Server Database Engine moves the data around
the heap to store the rows efficiently, so the order of rows cannot be
predicted and are in no particular order, similarly data pages are not sequenced
in a particular order. The only way to guarantee the order of the rows from a
heap is to use ORDER
BY
clause of Transact-SQL.

Access
without an index

When
you access data, SQL Server first determines if there is an index available. If
not, the database engine retrieves data by scanning the entire table. The
database engine begins scanning at the physical beginning of the table, and scans
through the full table page-by-page and row-by-row to look for qualifying data.
Then extracts and returns rows meeting the selection criteria.

Access
with an index

The
process is improved when indexes are present. If there is an appropriate index available,
SQL Server uses it to locate the data. An index improves the search process by
sorting data on key columns. The database engine begins scanning at the first
page of the index, and only scans those pages that potentially contain
qualifying data, based on the index structure and key columns. Finally retrieves
the data rows or pointers that contain the locations of the data rows to allow
direct row retrieval.

The
information SQL Server returns from the table scan, rows or pointers, depends
on whether the database engine scans a clustered or non-clustered index:


Clustered index:
SQL Server physically sorts table rows in index order based on key column
values. The scan returns the actual data rows.


Non-clustered index:
The index doesn’t affect physical sort order. The scan returns pointers to the
data rows.

Indexes Structure

An index is made up of nonleaf nodes
and leaf nodes, and is structured as B-Trees, as shown in figure below:

Description: C:\Users\PMNSER~1\AppData\Local\Temp\SNAGHTML4e1e44f.PNG


Nonleaf
nodes are hierarchical nodes that make up the index sort tree.


Nonleaf
nodes are also referred to as intermediary nodes.


Nonleaf
nodes point to other nonleaf nodes one step down in the hierarchy until reaching
the leaf nodes.


Leaf
nodes are at the bottom of the hierarchy.


In
a clustered index, the leaf nodes are data pages, containing table rows.


In a non-clustered index the leaf
nodes contain pointers to the data rows.

The non-clustered indexes are based
on clustered indexes when present. The leaf node of a non-clustered index points
to leaf node locations in clustered indexes. When you rebuild a clustered
index, the leaf node structure of non-clustered indexes also changes. Similarly
any changes or modification to the clustered index changes the non-clustered
index.

Indexes are most effective when you
need a small percentage of the records stored in the table and there is a high
level of selectivity between the elements (unique or nearly unique values). The
main reasons for the creation of indexes are:


Improve
performance when sorting or grouping data.


Improve
performance on queries that join tables.


Impose
uniqueness on a column or set of columns.

You won’t typically create indexes
on columns that have few unique values or on columns that are seldom referenced
in queries.

Size limits

A table can have:


Zero
or one clustered index.


Up
to 249 non-clustered indexes.

An index is
based on one or more key columns. When you define an index:


You
can specify no more than 16 key columns.


The
index row length (the total of all key columns) cannot exceed 900 bytes.

Variable-length
data types present a possible concern. If the current data in a variable length
data type does not exceed the 900-byte limit, you can create an index with variable-length
data types that have a potential of exceeding this limit. For example, you specify
the following columns:

Column

Maximum

Current maximum in use

nchar(50)

50

50

nchar(100)

100

100

nvarchar(400)

400

400

nvarchar(500)

400

400

You can
create an index based on these key columns, but SQL Server returns a warning that
errors can occur in the future. Consider a situation in which, at some future
time, you attempt to add data that uses the maximum length of the nvarchar
data types. When you do, the operation will fail. This is because SQL Server
cannot create a new row in the index because the total physical length, the
space required to store the data, exceeds the 900-byte limit.

Drawbacks

There are
some drawbacks to imposing indexes on a table:


Creating
indexes takes time and disk space.


Data
update queries (INSERT, UPDATE, or DELETE) must also update table indexes.


Small
tables generally do not receive measurable benefits from being indexed.


Key
columns cannot include bit, text, ntext, or image type data columns.

You use the
INSERT, UPDATE, and DELETE statements to modify table data. When you add data
(INSERT), change data values (UPDATE), or remove rows from a table (DELETE),
SQL Server updates any indexes on that table. The more indexes SQL Server has
to update, the more overhead that is incurred by the process.

Continue to
Part-2:

In the next
part, you will learn about the different types of indexes available in SQL
Server, and what are these indexes advantages and dis-advantages.




Array

One Response to “Guide to SQL Server Table Indexes – Part 1”

  1. Great article Farooq!

    When deciding between types index how do you decide which one to use? I have always thought

    -If the query needs a continuous range or order clause, then use clustered

    -If the query values do not follow an special order, then use non-clustered index.

    Is that a fair approach in your opinion??

    Thanks

    Kash Mughal
    SQL DBA
    http://www.sqlserver2012tutorial.com

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |