SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Training
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Performance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
SQL Azure
Developer
General DBA
ASP.NET / ADO.NET
SQL Azure

USEFUL SITES :

ASP.NET Tutorials
Windows and SQL Azure Tutorials
Cloud Hosting Magazine
SharePoint Tutorials
Windows Server Help

Write for Us

Share your SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

A High Level Comparison Between Oracle and SQL Server - Part ...
A High Level Comparison Between Oracle and SQL Server - Part ...
A High Level Comparison Between Oracle and SQL Server - Part ...
A High Level Comparison Between Oracle and SQL Server

More     
 
Latest FAQ's

Add Node to A SQL Server failover Cluster failed with invalid ...
SQL Server Destination remote server error
Setting Up Data And Log Files For SQL Server
Will Check Constraints Improve Database Performance?

More     
   
Latest Software Reviews

dbForge Review
Spotlight on ApexSQL Diff - Server-based database comparison tool ...
Spotlight on ApexSQL Data Diff - Server-based database comparison tool ...
Spotlight on ApexSQL Doc 2008

More     

articles >> performance tuning >> Large Data Operations in SQL Server ...

Large Data Operations in SQL Server

By : Joe Chang
Feb 27, 2005

Page 2 / 7



Test Environment

The test environment server is a 2-way Xeon 2.4GHz/512K cache, 533MHz system bus, and 2GB memory on a motherboard with the ServerWorks GC-LE chipset. The test database data file resides on 2x15K 18G SCSI U160 drives (OS) striped, all log files on one 10K SCSI drive, and the tempdb data file on an IDE/ATA drive. The database recovery model is set to SIMPLE, and 4GB of space is pre-allocated for both data and log. In addition, the max degree of parallelism was restricted to one CPU for most tests. The results are single query time measurements. There are some variations between measurements since there is no easy method to determine the contents of the buffer cache. An attempt was made to determine a reasonably repeatable measurement.

The following script creates a test table and populates the table with test data. As defined, this table has 8 integer columns of 4 bytes each, a char column of 10 bytes, a decimal column of 5 bytes, and three other columns of 8 bytes each for a total 71 bytes. Along with the metadata overhead, this works out to approximately 81 bytes per row, allowing 99 rows to fit in each 8KB page at 99% fill factor, with 77 bytes free.

CREATE TABLE M3C_00 (
     ID int NOT NULL,
     ID2 int NOT NULL,
     ID3 int NOT NULL,
     ID4 int NOT NULL,
     ID5 int NOT NULL,
     ID6 int NOT NULL,
     SeqID int NOT NULL,
     DistID int NOT NULL,
     Value char(10) NOT NULL,
     randDecimal decimal (9,4) NOT NULL,
     randMoney money NOT NULL,
     randDate datetime NOT NULL,
     seqDate datetime NOT NULL )

The script below populates the above table with 10,000,000 rows. The ID column is a sequence from 1 to 10M, but is not explicitly defined as an identity column. The columns ID2 to ID6 are not important to this series of tests, and any values can be used. The SeqID values are sequential and DistID values are distributed. In this case, the table has 10M rows. The first 100,000 rows have SeqID value 1; the next 100K rows have SeqID value 2 and so on. The first row has DistID value 1, the next row with DistID value is row 101, and so on. All rows with a specific SeqID are in adjacent rows. All rows with a specific DistID value are 100 rows apart, which cause each row having a common DistID value to be in a different 8KB data page. This table is not clustered, so while there is no guarantee on the row placement, it does appear that the rows are inserted in sequence.

BEGIN TRANSACTION

DECLARE @I int, @rowCnt int,
@p int, @sc1 int, @dv1 int

SELECT @I = 1, @rowCnt =
10000000, @p =100, @sc1 = 100000

SELECT @dv1 = @rowCnt/100000

WHILE @I <= @RowCnt BEGIN

INSERT M3C_00
(ID, ID2, ID3, ID4, ID5, ID6, SeqID, DistID,

Value, randDecimal, randMoney,
randDate, seqDate)

VALUES ( @I,
@I, @I/2, @I/4, @I/10, @I/20,

(@I-1)/@sc1 + 1, -- Sequential values

(@I-1)%(@dv1) + 1, -- Distributed values

>CHAR(65 + 26*rand())+CHAR(65 +
26*rand())+CHAR(65 + 26*rand())

+CONVERT(char(6),CONVERT(int,100000*(9.0*rand()+1.0)))+CHAR(65 + 26*rand()),

10000*rand(), 10000*rand(),

DATEADD(hour,100000*rand(),'1990-01-01'),
DATEADD(hour,@I/5,'1990-01-01') )

SET @I = @I+1

END

COMMIT TRANSACTION

CHECKPOINT

The above script takes 17min 45sec to run, averaging 9,380 row inserts per second. The M3C_00 table ends up at 105,264 pages with 95% fillfactor. Instead of regenerating the table each time a fresh data set is needed, the original data set is preserved in the above table and a copy of the original is generated with the SELECT INTO statement below.

SELECT
ID,ID2,ID3,ID4,ID5,ID6,SeqID,DistID,Value,randDecimal,randMoney,randDate,seqDate

INTO M3C_01

FROM M3C_00

CHECKPOINT

This table has 101,012 pages, averaging 99 rows per page at 99% fillfactor for a table size of just over 789MB. The SELECT INTO takes approximately 28 seconds to run, averaging 357,000 rows inserted per second. The following two indexes are created on SeqID and DistID columns. Each index occupies 18,554 pages or 145MB.

CREATE INDEX IX_M3C_01_Seq
ON M3C_01 (SeqID) WITH SORT_IN_TEMPDB

CHECKPOINT

CREATE INDEX IX_M3C_01_Dist
ON M3C_01 (DistID) WITH SORT_IN_TEMPDB

CHECKPOINT

The first index takes approximately 30 seconds to create with 256M memory and the seconds takes 60 seconds.


<< Prev Page     Next Page>>    








C# Help and Tutorials | PHP MySQL Tutorial | Sharepoint Tutorial | Azure Tutorial | Cloud Hosting Magazine | ASP.NET Tutorials | Windows Server Help | Windows Phone Pro | Silverlight Ace | Visual Studio Tutorials | Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | SQL Server Training Videos | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 2010 Jude O'Kelly. All rights reserved