Clustered Indexes, Pages and Sort | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Clustered Indexes, Pages and Sort

Hi,<br /><br /> a few days ago I learnt that the data of a table with a clustered index are not sorted physically in the disk (I thought it was[<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ />]). A read a post here and some links that you guys posted in other topics, and I learnt that the only thing that is really sorted is the b-tree that holds the cluster index, and because it has a double linked list, it can be sorted very fast. It seems fine to me, but yesterday a read that the only thing that is sorted is the index rows in every page, but not the all the rows in all the pages together (my poor english vocabulary don’t let me explain it like I want). I understood that if the index consists in 100 pages, each page is sorted individually (the n rows on it), without sorting the 100*n rows together.<br /><br />If that’s true, I have this doubt. If I have the ideal case of 2 pages of index rows sorted in this way:<br /><br />Page 1<br />Row 1.1<br />Row 1.2<br />Row 1.3<br />Row 1.4<br /><br />Page 2<br />Row 2.1<br />Row 2.2<br />Row 2.3<br />Row 2.4<br />Row 2.5<br /><br />What happen if a change the clustered index with another sort, and the new sorted data is<br /><br />Row 2.2<br />Row 1.2<br />Row 1.3<br />Row 2.5<br />Row 2.3<br />Row 2.1<br />Row 1.1<br />Row 2.4<br />Row 1.4<br /><br />If only the data in the pages are sorted, i’d get something like <br /><br />Page 1<br />Row 1.2<br />Row 1.3<br />Row 1.1<br />Row 1.4<br /><br />Page 2<br />Row 2.2<br />Row 2.5<br />Row 2.3<br />Row 2.1<br />Row 2.4<br /><br />Now, this makes me wonder how can the sqlserver really know the real order of the data?<br /><br />Is this ok, or I’m completely lost …..?[8D]<br /><br />Thanks,<br /><br />Patrick MacKAY<br />ASP.NET MVP
Not sure about the SQL internal issues, but the actual order "on disk" is considered non-critical – and it should be, except when index fragmentation becomes noticeable. SQL doesn’t know the real order, it just reads the data from the indexes and the table. If no ORDER BY is supplied, then I guess the data is returned ordered rougly by the primary key (if present) but with the distinct possibility that index fragmentation messes it up.
Patrick,
a clustered index does not sort the data physically on the disk according to the clustered index keys. That would be way to expensive. What SQL Server guarantees to maintain is the page chain. A doubly linked list that points to the previous and the next page in order. Since this list can only be sorted in one way, you can have only one clustered index on every table. The leaf level of a clustered index is the data or if you want the table itself. So when using a clustered index you already have the data. No need to worry that parts of a table would get out of order. But as Adriaan already mentioned, the storage engine is kinda independant from the relational engine. The only reliable way to get a sorted resultset in the way you expect it, is the use of ORDER BY with your SELECT statement. While a SELECT * FROM table usually produces a resultset that seems to be in the order of a clustered index, there is no guarantee. Run such a query on a multiprocessor machines and it’s likely to deliver a "wrong" sorted set. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Ich unterstütze PASS Deutschland e.V. http://www.sqlpass.de)

mmm … mmm … mmm … thank you guys, I understand your points, and maybe it’s better to rely on sql server and not to worry about it, but I can’t. Do you know a good book where I can find what I’m looking for?. Does the Delaney’s book cover this subject? I have more questions about pages, extents, rows, etc, and I need to get clear… that’s the way it is. I need to know. Thanks again. Patrick MacKAY
ASP.NET MVP
Delany is the book.
Luis Martin
Moderator
SQL-Server-Performance.com One of the symptoms of an approaching nervous breakdown is the belief that one’s work is terribly important
Bertrand Russell
All postings are provided “AS IS” with no warranties for accuracy.
Well, thank you all. Patrick MacKAY
ASP.NET MVP
You will come as close to physical order as possible after rebuilding clustered index. However, even when data are phyisicaly stored in desired order, if you don’t specify order by clause many different alghorithms may be aplied (e.g. merge, hash joins) so order is not guaranteed except when order by clause is specified.
The book Luis mentioned is "Inside Microsoft SQL Server 2000" by Kalen Delaney.
"Inside SQL Server 2000" *is* the technical reference to SQL Server. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Ich unterstütze PASS Deutschland e.V. http://www.sqlpass.de)

]]>