Site sponsored by: Idera Try Idera’s new SQL admin toolset
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Performance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
Developer
General DBA
ASP.NET / ADO.NET

Write for Us

Share you SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

Capture DDL Changes using Change Data Capture with SQL Server 2008 ...
Business Intelligence in Collaborative Planning, Forecasting and Replenishment
Inside SQL Server Cluster Setup and Troubleshooting Techniques - Part I ...
Configure and Manage Policy Based Management in SQL Server 2008 ...

More     
 
Latest FAQ's

Cannot Start SQL Server Service
Users are able to connect to report manager but not able ...
Errors when SQL Server Snapshot Replication is Running
How to Display Server Name or IP Address in a Reporting ...

More     
   
Latest Software Reviews

Spotlight on ApexSQL Doc 2008
ApexSQL Enforce
Embarcadero Change Manager
SQL Server DBA Dashboard

More     

articles >> performance tuning >> Large Data Operations in SQL Server ...

Large Data Operations in SQL Server

By : Joe Chang
Feb 27, 2005

Page 2 / 7



Test Environment

The test environment server is a 2-way Xeon 2.4GHz/512K cache, 533MHz system bus, and 2GB memory on a motherboard with the ServerWorks GC-LE chipset. The test database data file resides on 2x15K 18G SCSI U160 drives (OS) striped, all log files on one 10K SCSI drive, and the tempdb data file on an IDE/ATA drive. The database recovery model is set to SIMPLE, and 4GB of space is pre-allocated for both data and log. In addition, the max degree of parallelism was restricted to one CPU for most tests. The results are single query time measurements. There are some variations between measurements since there is no easy method to determine the contents of the buffer cache. An attempt was made to determine a reasonably repeatable measurement.

The following script creates a test table and populates the table with test data. As defined, this table has 8 integer columns of 4 bytes each, a char column of 10 bytes, a decimal column of 5 bytes, and three other columns of 8 bytes each for a total 71 bytes. Along with the metadata overhead, this works out to approximately 81 bytes per row, allowing 99 rows to fit in each 8KB page at 99% fill factor, with 77 bytes free.

CREATE TABLE M3C_00 (
     ID int NOT NULL,
     ID2 int NOT NULL,
     ID3 int NOT NULL,
     ID4 int NOT NULL,
     ID5 int NOT NULL,
     ID6 int NOT NULL,
     SeqID int NOT NULL,
     DistID int NOT NULL,
     Value char(10) NOT NULL,
     randDecimal decimal (9,4) NOT NULL,
     randMoney money NOT NULL,
     randDate datetime NOT NULL,
     seqDate datetime NOT NULL )

The script below populates the above table with 10,000,000 rows. The ID column is a sequence from 1 to 10M, but is not explicitly defined as an identity column. The columns ID2 to ID6 are not important to this series of tests, and any values can be used. The SeqID values are sequential and DistID values are distributed. In this case, the table has 10M rows. The first 100,000 rows have SeqID value 1; the next 100K rows have SeqID value 2 and so on. The first row has DistID value 1, the next row with DistID value is row 101, and so on. All rows with a specific SeqID are in adjacent rows. All rows with a specific DistID value are 100 rows apart, which cause each row having a common DistID value to be in a different 8KB data page. This table is not clustered, so while there is no guarantee on the row placement, it does appear that the rows are inserted in sequence.

BEGIN TRANSACTION

DECLARE @I int, @rowCnt int,
@p int, @sc1 int, @dv1 int

SELECT @I = 1, @rowCnt =
10000000, @p =100, @sc1 = 100000

SELECT @dv1 = @rowCnt/100000

WHILE @I <= @RowCnt BEGIN

INSERT M3C_00
(ID, ID2, ID3, ID4, ID5, ID6, SeqID, DistID,

Value, randDecimal, randMoney,
randDate, seqDate)

VALUES ( @I,
@I, @I/2, @I/4, @I/10, @I/20,

(@I-1)/@sc1 + 1, -- Sequential values

(@I-1)%(@dv1) + 1, -- Distributed values

>CHAR(65 + 26*rand())+CHAR(65 +
26*rand())+CHAR(65 + 26*rand())

+CONVERT(char(6),CONVERT(int,100000*(9.0*rand()+1.0)))+CHAR(65 + 26*rand()),

10000*rand(), 10000*rand(),

DATEADD(hour,100000*rand(),'1990-01-01'),
DATEADD(hour,@I/5,'1990-01-01') )

SET @I = @I+1

END

COMMIT TRANSACTION

CHECKPOINT

The above script takes 17min 45sec to run, averaging 9,380 row inserts per second. The M3C_00 table ends up at 105,264 pages with 95% fillfactor. Instead of regenerating the table each time a fresh data set is needed, the original data set is preserved in the above table and a copy of the original is generated with the SELECT INTO statement below.

SELECT
ID,ID2,ID3,ID4,ID5,ID6,SeqID,DistID,Value,randDecimal,randMoney,randDate,seqDate

INTO M3C_01

FROM M3C_00

CHECKPOINT

This table has 101,012 pages, averaging 99 rows per page at 99% fillfactor for a table size of just over 789MB. The SELECT INTO takes approximately 28 seconds to run, averaging 357,000 rows inserted per second. The following two indexes are created on SeqID and DistID columns. Each index occupies 18,554 pages or 145MB.

CREATE INDEX IX_M3C_01_Seq
ON M3C_01 (SeqID) WITH SORT_IN_TEMPDB

CHECKPOINT

CREATE INDEX IX_M3C_01_Dist
ON M3C_01 (DistID) WITH SORT_IN_TEMPDB

CHECKPOINT

The first index takes approximately 30 seconds to create with 256M memory and the seconds takes 60 seconds.


<< Prev Page     Next Page>>    








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 1999-2008 by T10 Media. All rights reserved