Site sponsored by: Idera The gold standard of SQL Server performance monitoring & diagnostics.
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Performance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
Developer
General DBA
ASP.NET / ADO.NET

Write for Us

Share your SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

System Data Collection Reports
Recover Data Using Database Snapshots
Analyze and Fix Index Fragmentation in SQL Server 2008
Powerful Geographical Visualisations made easy with SQL 2008 Spatial (Part 2) ...

More     
 
Latest FAQ's

How to alter a User Defined Data Type?
How to unzip a File in SSIS?
How to view previous query plans?
ALTER TABLE SWITCH statement failed because the object '%.*ls' is not ...

More     
   
Latest Software Reviews

Spotlight on ApexSQL Doc 2008
ApexSQL Enforce
Embarcadero Change Manager
SQL Server DBA Dashboard

More     

articles >> performance tuning >> Large Data Operations in SQL Server ...

Large Data Operations in SQL Server

By : Joe Chang
Feb 27, 2005

Page 3 / 7



Test Queries

The first queries tested are the Select statements are shown below. The first pair of Select statement queries are on the SeqID column. The second pair of Select statement queries are on the DistID columns. The first query in each pair is a basic Select statement without index hints. The second query in each pair specified as an index hint.

-- Sequential rows, table scan

SELECT
AVG(randMoney) FROM
M3C_01 WHERE SeqID = 91

-- Sequential rows, index seek and bookmark
lookup

SELECT
AVG(randMoney) FROM
M3C_01 WITH(INDEX(IX_M3C_01_Seq))
WHERE SeqID = 91

-- Distributed rows, table scan

SELECT
AVG(randMoney) FROM
M3C_01 WHERE DistID = 91

-- Distributed rows, index seek and bookmark
lookup

SELECT
AVG(randMoney) FROM
M3C_01 WITH(INDEX(IX_M3C_01_Disr))
WHERE DistID = 91

The estimated and actual row count for both search arguments (SARG) is 100,000 rows. The default execution plan for the query without hints for either SARG is a table scan. The hint forces the execution plan to use the index on the SARG, which in turn requires a bookmark lookup to retrieve the randMoney column value. Now SQL Server statistics only keeps distribution information on the range of column values, not the location of the rows. Therefore, SQL Server cannot know how many different pages are actually required. Since both SARG cases have the same estimated (and actual) row count, the execution plan and estimated cost is identical for either SARG as long as the hints are also equivalent. The execution plans for the first pair of queries are shown below.

The Table Scan involves 101,012 pages and the Bookmark Lookup involves 100,000 rows, so the actual query time is essentially a comparison of the cost per page of a table scan relative to the bookmark lookup cost per row. The Table Scan cost detail is shown below.

The Estimated I/O cost is displayed as 37.4 and the Estimated CPU cost is displayed as 5.50, but the Estimated cost for the entire operation is 85.86. Calculating the I/O and CPU costs from the formula in the previous section, one arrives at 74.86 for I/O and 11.00 for CPU. This does work out to exactly the 85.86 shown for the combined I/O and CPU cost. Every now and then, the displayed value of the I/O and CPU costs are exactly one-half the expected value, but the combined cost is exactly the expected cost.

The index seek cost detail is shown below. The I/O cost of 0.140240 works out to the base cost from the previous section plus the cost for 185 additional leaf-level pages, meaning that each index leaf-level page holds approximately 541 rows. The CPU cost 0.110000 corresponds to 100,000 rows for a total I/O and CPU cost of 0.250619.

The bookmark lookup cost detail is shown below. The I/O cost of 311.86 works out 99.8% the cost of the 100,000 times the single row Bookmark Lookup I/O cost of0.0031249.

Rather than change system memory or the size of test data, various settings for max server memory were used. The run times measured from Profiler at 256M and 1,154M max server memory are shown next. Query Analyzer STATISTICS IO and the Perfmon disk counters confirm that the entire table is read from disk for the 256M server memory configuration, with the exception of the Index Seek and Bookmark Lookup plan on SeqID, which required very few disk reads. The 1,154M test runs are measured when both data and indexes are in memory.

SELECT query
100K rows

Sequential rows

Sequential rows

Distributed rows

Distributed rows

256M server mem

Index + BL

Table Scan

Index+BL

Table Scan

Query time (sec)

0.3

10.5

167

10.5

Rows or Pages/sec

333,333(R)

9,620(P)

599(R)

9,620(P)

Disk IO/sec

Low

~1,200

~600

~1,200

Avg. Byte/Read

 

64K

8K

64K

1154MB server mem

 

 

 

 

Query time

0.266

1.076

0.373

1.090

Rows or Pages/sec

376,000

93,877

268,000

92,672

Both Table Scans ran in approximately 10.5 seconds, so the location of the required rows does not affect the table scan time. The disk I/O rate for each physical disk was approximately 600/sec at 64KB/Read for a transfer rate of 37.5MB/sec, about the maximum throughput for the Seagate ST318451 disk (current generation 15K drives can sustain >50MB/sec sequential transfer capability).

The index seek plan for sequential rows exhibited some initial disk activity (less than 200 disk I/Os) at the 256M server memory configuration, but thereafter ran in memory. The index seek plan for distributed rows was forced to retrieve each 8KB page of the 101,012 page table from disk when max server memory was restricted to 256M. Each disk averaged 300 IO/sec at 8KB/Read. It might seem that 300 IO/sec is somewhat high for a 15K disk. However, the average queue depth was 40 per disk, and the data was distributed over 0.8GB of the 32GB total combined disk space. The high queue depth allows the disk drive to re-sequence I/O order and the small space usage reduces the average seek travel distance, both of which improves I/O performance.


<< Prev Page     Next Page>>    








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 1999-2008 by T10 Media. All rights reserved