When should primary keys be used?
Our department is fairly new to SQL Server. I apologize if these are dumb questions, but I’ve spent hours digging on the Internet and cannot find answers to my questions.
When we create our tables, we create a clustered unique index on the column(s) that best fits the table and the way it will be accessed.
Someone outside our department was reviewing our database design, and pointed out that we have no primary keys defined. It raised a red flag in their mind. They were concerned that not having a primary key would create performance problems and require additional CPU resources.
I have read and read, and all I have found is that when you create the primary key constraint, it requires all rows in the key column(s) to be unique, and forces the creation of a clustered unique index. It appears to me that we have done this process manually by the creation of a clustered unique index, and that adding a primary key constraint would not buy us anything.
So here are my questions:
1) Do we need to create a primary key constraint on a table that already has a clustered unique index? If so, why?
2) If we need a primary key constraint, should we delete the clustered unique index we created, and then add the primary key constraint? My concern is that we will end up with two clustered unique indexes on the same column(s) that will add to database load times, etc.
3) Is there an advantage to using the primary key constraint to automatically create the clustered unique index as opposed to just outright creating the clustered unique index if they are on the same column.
Let’s take a look at each of these three questions.
First Question: Technically speaking, a primary key is not required on a table for any reason. A primary key serve two purposes. It acts to enforce entity integrity of the table. What this means is that it ensure that there are no duplicate records in a table. Next, a primary key can be used along with a foreign key to ensure that referential integrity is maintained between tables. Because of these two reasons, it is generally recommended that all tables have a primary key.
By itself, a primary key does not have a direct affect on performance. But indirectly, it does. This is because when you add a primary key to a table, SQL Server creates a unique index (clustered by default) that is used to enforce entity integrity. But as you have already discovered, you can create your own unique indexes on a table. So, strictly speaking, a primary index does not affect performance, but the index used by the primary key does.
Now to directly answer your question. You don’t have to have primary keys on your tables if you don’t care about the benefits that arise from using them. If you like, you can keep the current indexes you have, and assuming they are good choices for the types of queries you will be running against the table, then performance will be enhanced by having them. Replacing your current indexes, and replacing them with primary keys will not help performance.
Second Question: I think I answered most of this question above. But I do want to point out that you cannot have two clustered indexes on the same table. So if you did want to add primary keys to the tables you currently have, you can, using one of two techniques. First, you can choose to add a primary key using a non-clustered index, or second, you can drop your current indexes, and then add a primary key using a clustered index.
Third Question: Here’s what I do. I assign primary keys to every table because this is a best database design practice. But before I do, I evaluate whether or not the primary key’s index should be clustered or non-clustered, and then choose accordingly. Since you can only have one clustered index, it should be chosen well. See my clustered tips webpage for more details. Next, I then evaluate the table for additional indexes that may be needed, and proceed accordingly.
It is not a good idea, from a performance perspective, to accept the default of a clustered index on a primary key, as it may not be the best choice for the use of a clustered index. In addition, it is not a good idea to “double up” on indexes. In other words, don’t put a Primary key non-clustered index on a column, and a clustered index on the same column (this is possible, although never a good idea).
Always think about indexes and why they exist. Only add indexes where they are needed, and nowhere else. Too many indexes can be as bad for performance as too few indexes.