Another new toy | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Another new toy

It’s looking like a busy summer for me: we just ordered a new HP DL380 G5 / 2 x 2.33 GHz Quad Core Xeon E5345 / 16 GB RAM With a new wrinkle (for me at least): 2 x 4 Gbit Fibre channel HBAs connected to, you’ve probably guessed it, a new EMC Clarion SAN, with 25 disks dedicated to SQL Server (plus other spindles for other functions). Per JC’s comments on this site, I pretty much held out no option for the other members of my team to share the disks. Too many horror stories about email storage or god-knows-what sucking up the I/O performance. Anyhow, we should have this beast up and running in a couple weeks, and I’ll post performance stats here.
Good luck!!! Luis Martin
Moderator
SQL-Server-Performance.com All in Love is Fair
Stevie Wonder
All postings are provided “AS IS” with no warranties for accuracy.
share your knowledge and assistance
hoard your storage?
Absolutely <img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />
The new toy is in da haus. Details: SQL Server 2000 Enterprise SP4 32-bit
+ hotfixhttp://support.microsoft.com/kb/899761 (AWE Memory Allocation bug on x64 OS) Windows Server 2003 Standard 64 bit EMC SAN CX3-10 for storage, configured with 2 trays of disks dedicated to SQL Server; 20 72 GB 15K rpm drives in one meta-lun(Raid 10) provide a 300GB data volume, 4 drives for other functions, 1 hot spare. Test results follow
JC’s "Table Scan Test" spid Calls ms CPUms dpages/s MB Table MB/sec ReadsPerSec Disk MB/sec Avg-sec/IO
———– ———– ———– ———– ————- ————- ————- ————- ————- ————-
55 1 111250 495 37578.8 32661.27 293.5844 1223.137 295.4627 0.5878549
55 1 105233 686 39727.48 32661.27 310.3709 1254.911 312.4612 0.6696401
55 1 104686 704 39935.06 32661.27 311.9927 1261.458 314.0938 0.6635872
JC’s "Random Read Tests" spid Calls ms CPUms CallsPerSec RowsPerCall RowsPerSec ReadsPerSec Read MB/sec DWritesPerSec D Write MB/sec Avg-sec/IO LWritesPerSec L Avg-sec/IO
———– ———– ———– ———– ————- ————- ————- ————- ————- ————- ————– ————- ————- ————-
54 741 30030 3 24.67533 10 246.7533 242.9237 1.902004 0 0 4.088964 0 0
54 2083 30000 17 69.43333 20 1388.667 1271 9.950781 0 0 0.761133 0 0
54 1655 30000 20 55.16667 30 1655 1470.633 11.51927 0 0 0.6516693 0 0
54 1302 30000 25 43.4 50 2170 1865.1 14.60885 0 0 0.5089808 0 0
54 828 30016 30 27.58529 100 2758.529 2323.161 18.19967 0 0 0.4044915 0 0
54 324 30110 41 10.76054 300 3228.163 2638.492 20.66797 0 0 0.3504563 0 0
54 108 30126 38 3.584943 999.9722 3584.844 2864.204 22.43909 0 0 0.3220995 0 0
54 35 31423 48 1.113834 2999.886 3341.374 2548.388 19.96796 0 0 0.3623717 0 0
54 13 32283 65 0.4026887 9998.308 4026.206 3025.555 23.70418 0 0 2.389449 0 0

JC’s "In Memory Tests" spid Calls ms CPUms CallsPerSec RowsPerCall RowsPerSec ReadsPerSec Read MB/sec DWritesPerSec D Write MB/sec Avg-sec/IO LWritesPerSec L Avg-sec/IO
———– ———– ———– ———– ————- ————- ————- ————- ————- ————- ————– ————- ————- ————-
53 26014 30000 59 867.1334 10 8671.333 0 0 0 0 0 0 0
53 54278 30000 333 1809.267 30 54278 0 0 0 0 0 0 0
53 50524 30000 849 1684.133 100 168413.3 0 0 0 0 0 0 0
53 5635 30000 939 187.8333 1000 187833.3 0 0 0 0 0 0 0

Anecdotally, for example during a restore from backup or an index creation, Perfmon reports 145 MB/sec write and 320-346 MB/sec read for the data volume.
For comparison with a direct-attached-storage machine with similar profile, seehttp://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=21246
i think 346MB/sec is near the limit of 1 x 4Gb/s FC port
but it could also be the limit for 20 disks in a SAN
the older line of Clarion could only do 10MB/sec per disk,
you are getting 17MB/sec the random read of 2300-3025/sec is a little low,
working out to 115-150/sec per disk
but the latency is great, < 1ms/read (it should be milli-sec, not sec, my mistake) I am sure the data set is larger than what fits on the SAN cache,
so it cannot be in-cache results,
I am not sure why IOPS are about right for queue depth 1,
direct attach disks would show high-queue behavior, very high IOPS at higher latency 15-20ms
but why is the disk latency so low?
The first few results of the "in memory tests" are concerning; not sure what’s going on there. Small result sets/rows per call is a lot slower than I expect, and slower than on another similar box.
yes, you are right
was SET NOCOUNT ON applied?
Hi Joe – yes, I believe so. I took the test scripts directly from your other post. I am puzzled; everything else makes sense to me, but I can’t quite figure out whats going on there.
try openning 4 windows in QA consecutively, ie, they have seqeuential spids, 51,52,53,54 run the test in each, one after the other, it could be the management/monitoring software is disrupting one of the processors,
and if your QA window is on it, too bad for you
New info: just installed the license for EMC’s "powerpath" module, which should be load-balancing across the two HBA’s and the two EMC storage processors. I see immediately some improvement. New "Random Read" test results are better: spid Calls ms CPUms CallsPerSec RowsPerCall RowsPerSec ReadsPerSec Read MB/sec DWritesPerSec D Write MB/sec Avg-sec/IO LWritesPerSec L Avg-sec/IO
———– ———– ———– ———– ————- ————- ————- ————- ————- ————- ————– ————- ————- ————-
56 1064 30000 6 35.46667 10 354.6667 237.1333 1.860677 0 0 4.170508 0 0
56 2390 30016 21 79.6242 20 1592.484 967.5173 7.586058 0 0 1.000895 0 0
56 1657 30003 21 55.22781 30 1656.834 979.0021 7.676576 0 0 0.9856671 0 0
56 1238 30000 130 41.26667 50 2063.333 1198.267 9.394531 0 0 0.8029932 0 0
56 898 30000 120 29.93333 99.99777 2993.267 1699.933 13.33307 0 0 0.5521785 0 0
56 377 30156 64 12.50166 300 3750.497 2094.807 16.43382 0 0 0.4446028 0 0
56 123 30033 74 4.095495 999.9756 4095.395 2255.585 17.69512 0 0 0.4093472 0 0
56 51 30343 92 1.680783 2999.726 5041.888 2864.219 22.46451 0 0 0.314651 0 0
56 17 31060 88 0.5473278 9998.353 5472.376 2972.923 23.31651 0 0 2.325929 0 0

Table scan performance is a smidge better (don’t know if I can attribute that to powerpath or some other factor)
spid Calls ms CPUms dpages/s MB Table MB/sec ReadsPerSec Disk MB/sec Avg-sec/IO
———– ———– ———– ———– ————- ————- ————- ————- ————- ————-
55 1 105390 488 39668.3 32661.27 309.9086 1253.041 311.9957 0.6695946
55 1 104533 615 39993.52 32661.27 312.4493 1263.314 314.5536 0.6730401
55 1 104593 652 39970.57 32661.27 312.2701 1262.589 314.3731 0.6714727
quote:Originally posted by joechang try openning 4 windows in QA consecutively, ie, they have seqeuential spids, 51,52,53,54 run the test in each, one after the other, it could be the management/monitoring software is disrupting one of the processors,
and if your QA window is on it, too bad for you

Good idea; I tried that but with little change to the results. Is it possible that AWE overhead could cause this?
i just realized my scripts do not start with
SET NOCOUNT ON are you running QA local or remote put the SET NOCOUNT ON
at the top
check the messages to be sure NOCOUNT is on
Joe – I located the problem: the server is fine; I had not noticed the following in the test script:
In the script called "new in memory read test," only the first test loop ( while datediff( … ) < 30000 etc. ) contains an explicit transaction, and the other test loops further down do not. Commenting out the BEGIN TRAN and COMMIT makes the first loop match the others, and test results are more like expected:
spid Calls ms CPUms CallsPerSec RowsPerCall RowsPerSec ReadsPerSec Read MB/sec DWritesPerSec D Write MB/sec Avg-sec/IO LWritesPerSec L Avg-sec/IO
54 45411 30076 102 1509.875 10 15098.75 0 0 0 0 0 0 0
54 41592 30046 250 1384.277 30 41528.32 0 0 0 0 0 0 0
54 34937 30093 619 1160.968 100 116096.8 0 0 0 0 0 0 0
54 5640 30000 940 188 1000 188000 0 0 0 0 0 0 0

]]>