Large Data Operations in SQL Server – Page 5

The net result of the execution plan cost formulas is that the default plan using an Index Seek is less expensive than the Table Scan plan by a factor of 238. This is entirely due to the difference in cost between an Index Seek for the 100,000 rows in a high-density index and a Table Scan of 101,012 pages and 10M rows. The table below shows the actual Update and the subsequent checkpoint time for 256M and 1154M configurations.

UPDATE query – 100K rows	Sequential rows	Sequential rows	Distributed rows	Distributed rows
256M server mem	Index	Table Scan	Index	Table Scan
Query time (sec)	1.3	12.6	476.6	28
Checkpoint time (sec)	0.4	0.6	14.5	8
Rows /sec	57,471	7,576	203	2,778
1154MB server mem
Query time (sec)	0.8	1.3	0.9	1.5
Checkpoint time (sec)	0.2	0.1	23	23
Rows /sec	100,000	71,429	4,184	4,082

The index plan is indeed faster when all modified rows are in adjacent pages, but is much slower if the modified rows are distributed across many pages. Since SQL Server statistics do not determine the distribution of the affected pages, the execution plan cost formula should not assume that rows are in adjacent pages. There is no expectation for rows from a common nonclustered index value to be stored in adjacent rows. SQL Server however, always chose the index plan if an index on the SARG exists because the execution plan does not take into account the cost of retrieving the data pages in a manner similar to the Bookmark Lookup operation. It is clear that the true cost does reflect the cost of retrieving the actual data row. This discrepancy could lead to a very slow plan compared to the table scan method when no indexes are present.

A simple change of the UPDATE query in the SET condition to randMoney = randMoney + 1.0 yields the plan show below. Now the default plan uses a table scan, and forcing the index on the SARG shows a plan with the bookmark lookup operation.

Delete Queries

The delete queries are shown below. The first two queries involve rows in sequential pages, and the second pair involves rows distributed across all pages as in the other examples.

— Sequential rows, index seek

DELETE M3C_01 WHERE SeqID = 91

CHECKPOINT

— Sequential rows, table scan

DELETE m FROM M3C_01 m WITH(INDEX(0)) WHERE SeqID = 71

CHECKPOINT

— Distributed rows, index seek

DELETE M3C_01 WHERE DistID = 91

CHECKPOINT

— Distributed rows, table scan

DELETE m FROM M3C_01 m WITH(INDEX(0)) WHERE DistID = 27

CHECKPOINT

Without hints, the execution plan uses an Index Seek. The I/O cost for reading the actual data page and row is not taken into account by the execution plan, similar to the Update statement. Forcing a table scan shows a plan that is much more expensive. After the Table Delete, there is Spool operation followed by Index Delete operations for the two indexes on this table.

The Index Seek and Table Scan operations are the same as the previous Select and Update queries. The Top operation in each case of the two above Delete statements are also the same as the corresponding Update statements. Unlike the Update, the Delete has no Compute Scalar operation. However, the Index Seek has cost of 0.250619, the Top add 0.010000 for a total of 0.260619, the Table Delete adds 0.110068, yet the Table Delete subtree cost is 0.380687, so somehow an additional amount of 0.010000 crept into the Table Delete subtree cost. The Table Delete operation has a fixed I/O cost 0.010068 irrespective of the number of rows and pages involved.

Continues…

Delete Queries

Leave a comment Cancel reply