SQL Server Performance Forum – Threads Archive
PK (Clustered Index) on Identity ColumnPerhaps someone could clarify something for me as I have read two statements on the site that contradict each other. Within the article on performance tuning at http://www.sql-server-performance.com/developers_tuning_tutorial.asp the following statement is made under Tips for Selecting a Clustered Index: Avoid putting a clustered index on columns that increment, such as an identity, date, or similarly incrementing columns, if your table is subject to a high level of INSERTS. Since clustered indexes force the data to be physically ordered, a clustered index on an incrementing column forces new data to be inserted at the same page in the table, creating a table hot spot, which can create disk I/O bottlenecks. Ideally, find another column or columns to become your clustered index. In the following form post http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=3 … an identify value often makes for a good column on which to base a primary key. I tend to fall into the second camp and often create the identity column as a primary key, thereby creating a clustered index on the UID column on most of my tables. As I understand it, an auto incrementing primary key does not result in page splits because as new pages are created and new rows INSERTed, the new row is added as the first row in the new page. I don’t understand the author’s comment in the first article regarding table hotspots. Any explanation/clarification would be very much appreciated.
I cant provide any references at the moment but I read somewhere that the disk hotspot problem was evident in 6.5, but by 7.0 it was no longer an issue.
Sorry for the confusion, but you are correct. In most cases in 7.0 and 2000, hot spots only occur if you have something like 300-400 inserts a second. So unless you have a very busy database, adding a clustered index to a identity column will generally offer the best performance. Of course, there are exceptions, but without a good reason otherwise, this is the best approach. But if you are using 6.5, this is not a good idea, and clustered indexed should be avoided for identity columns.
Brad M. McGehee