Site sponsored by: Idera Try Idera’s new SQL admin toolset
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


FAQ Topics

All FAQ's
General DBA
General Developer
DBA Performance Tuning
Developer Performance Tuning
Clustering
Error Messages

Write for Us

Share you SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

Policy Based Management in SQL Server 2008
Inside SQL Server Cluster Setup and Troubleshooting Techniques - Part I ...
Configure and Manage Policy Based Management in SQL Server 2008 ...
Using Column Sets with Sparse Columns

More     
 
Latest FAQ's

Cannot Start SQL Server Service
Users are able to connect to report manager but not able ...
Errors when SQL Server Snapshot Replication is Running
How to Display Server Name or IP Address in a Reporting ...

More     
   
Latest Software Reviews

Spotlight on ApexSQL Doc 2008
ApexSQL Enforce
Embarcadero Change Manager
SQL Server DBA Dashboard

More     

I want to add an identity or unique identifier column to a table in order to ensure that each row is unique. For best performance, which one should I use?



One of the key aspects of database table design is to ensure what is called entity integrity. What this means is that you need to ensure that each row in your database tables is unique. If you don't take the proper precautions, it is possible that one or more rows of your table might be the same. If this happens, what does this really mean (duplicate records)?

One of the more common ways to guarantee entity integrity is to create a primary key. A primary key, which is an index created on one or more columns, is used to guarantee that no duplicate records are entered into a table. If you try to enter a duplicate record, SQL Server won't let you and will give you an error message.

Some tables lend themselves to uniqueness and it is easy to find a column that you know will always be unique, and adding a primary key to the column is no problem. But in other cases, there is no single column that is unique. Instead, it may take two, three, four, or more columns to uniquely identify a column. If you want, you can create a primary key on multiple columns.

Unfortunately, creating a primary key on multiple columns has a major drawback. And that is the index that is created to enforce each row's uniqueness may get very large. The wider the index, the larger the physical index will be. This can hurt performance because it takes SQL Server more resources to maintain or to query wider indexes than it takes it to maintain or to query narrow indexes.

So is there any way to avoid creating primary keys with wide indexes? Yes, and that brings us to the point of our question. Instead of creating a primary key with a wide index, what you can do is to add either an identity or a unique identifier column to the table, and use this column as the primary key. This will greatly reduce the width of the index created, reducing the physical size of the index, and speeding up SQL Server's access to the table.

What an identify field or a unique identify column does is to automatically create a unique value for each row created in your database. You don't have to worry about creating these values for yourself, this is done automatically by SQL Server. But what you must decide is whether to use an identify field or a unique identity column. Ideally, from a performance perspective, you want to use the one that creates the smallest possible physical index in order to maximize performance. So which one of these types of columns should you use?

Let's first look at identify values. An identify value is an automatically incrementing integer that starts counting from a base seed value (such as 1, or 100, or 1000, whatever integer you choose), and increments by a value you specify, which is called an increment (such as 1, -1, 10, whatever integer you choose). The default seed is 1, and the default increment is 1. Because the values are never the same (they increment), an identify value often makes for a good column on which to base a primary key.

When you create an identity column in a table, you must also specify the data type as some form of integer. As you probably know, SQL Server supports several different integer types, each with different sizes (widths). They include:

Bigint 8 bytes -263 to 263-1 (SQL Server 2000 and 2005 only)
Int 4 bytes -2,147,483,648 to 2,147,483,647
Smallint 2 bytes -32,768 to 32,767
Tinyint 1 byte 0 to 255

When creating an identify column, always select the narrowest integer type that will meet your needs. If an Int will do, then don't use Bigint, as you will just unnecessary be increasing the physical size of the index, which can hurt SQL Server's performance.

A unique identifier column, like an identify column, is used to create unique values for each column in your table. But instead of using an incrementing value, instead it creates a unique value based on a special internal algorithm that creates what are called a Globally Unique Identifier (GUID), which is 16 bytes in size.

Right away, you should notice which unique column type — identify or unique identifier — will provide the best performance as a primary key. Because the unique identifier column will always be 16 bytes wide, it will always create a primary key index that is larger than any of the identity column options, and unless you have a special reason that is outside the scope of this question, should not be used as the basis as a primary key.

Instead, use one of the variations of the identity column, selecting the smallest one that meets your data's needs. This way, you can ensure that index created by the primary key is as small as it can, helping to ensure overall better performance of your database, and at the same time ensuring entity integrity of your database tables.








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 1999-2008 by T10 Media. All rights reserved