Optimizing Raid & File of an existing SQL2K on SAN | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Optimizing Raid & File of an existing SQL2K on SAN

Hello, I have to upgrade an existing SAN to carry on a quickly growing database and not so good performances. Behind this DB is hidden an international 24×24 Financial Consolidation application which generates a huge load. The application bought from my company is not very optimized in a SQL point of view. The actual configuration was build 2 years ago according to the specification given by the application provider, specifications which are under evaluated now. So the actual problematic database situated on a IBM Fast T700 SAN:
– ~55 Go Data – RAID 10 (4 HD FC 68Go 10000)
– <1 Go Log – RAID 1 (2 HD FC 68Go 10000)
– 136 Go Backup volume – RAID 10 (4 HD FC 68Go 10000) Its characteristics:
– 1 file holding all the data
– Database in Simple mode because the backup volume could not handle the daily volume of transaction (>136 Go in Bulk Mode & >1To on Full mode)
– Data grows 35 Go per year
– 1 index REBUILD per week
– 1 Full backup per day It’s WEAKNESS:
– Consolidations tasks generate thousands of SQL tables per day: real table, not temporary ones => amount of logs. The "temporary" tables are dropped at the end of each task.
– Too few utilisation of CLUSTERED indexes => heavy table fragmentation
– Not sure if it comes from the too high I/Os or from the multi-user application architecture but many blocking lock every day (30-100 no longer than a few seconds) The driving is provided by a 2 nodes cluster 4xPIV HT (8 logical CPU) 2 GHz with 8 Go memory (/PAE + AWE) under Windows 2003. The node carrying the database is dedicated. The global CPU stays between 30 & 50%, which seems to be ok. I made some perfmon measurement and calculations using the following rule:
Ideal RAID 10 HD NB = MAX( Disk Reads/sec+(2xDisk Reads/sec) ) / 1 HD theoretical max I/Os The maximum peak made me frighten (with 239 max theoretical I/Os per HD): 22 for Data (~5230 I/O mainly write) & 12 for logs (~3360 I/O full write). During 24h the overloaded I/O represent ~5% for the data and 11% for the logs (according to the actual configuration). So I#%92m looking after ways to decrease this overload without wasting too much hard drives. Building a 12 HD RAID 10 volume = a 384 Go drive only for storing a 1 Go file make me feel [xx(]. Increasing the Data number of HD will increase the Log Load too. I know putting together Data & Logs on a large RAID 10 wouldn’t be a good idea. Maybe a RAID10 with 2 logical volumes, but I#%92m not sure about the performance of simultaneous random & sequential I/O on the same array. I#%92m able to move the ‘#%92false#%92#%92 temporary table on a second Data file located on the backup drive as to get a better I/O distribution (these tables are not used during the 20 min nightly backup).
The last but not the least, I do not have a spare SAN for testing purpose. So any experience or idea is welcome.
Regards, Romain
Affecting sql code or index design is not the option?
quote:- ~55 Go Data – RAID 10 (4 HD FC 68Go 10000)
– <1 Go Log – RAID 1 (2 HD FC 68Go 10000)
– 136 Go Backup volume – RAID 10 (4 HD FC 68Go 10000)
Go is actually Gb, right?
quote:?- Database in Simple mode because the backup volume could not handle the daily volume of transaction (>136 Go in Bulk Mode & >1To on Full mode)
You can handle it by scheduling frequent (e.g. each 15 minutes) transaction log backups. For hardware and RAID configuration you will have answer from peers that are more experienced with it. Edited: Now I see it is 136 gb – 1 Tb daily. That is really too much. I believe the main problem is db and process bad design and/or implementation. You would gain much more if you can affect process and db design.
The application software is not developed by my company. I don#%92t know why they did choose this architecture, if they did not know the benefits of Clustered index to fragmentation or if the inserts were too slow for overall performance. So we can tell them their product may achieve more speed using it, but it#%92s their choice. According to my experience with this product and company, I#%92m really not sure we may get this kind of optimization before a long time. It may not suit their other client#%92s needs too. Sorry for my English, Go is actually GB (I#%92m French speaking). At this time, we did choose the Simple mode, so the transaction logs are automatically truncated. I had to add a scheduled job with a “backup log database_xxx with truncate_only”, because of application locks sometimes blocking the automatic truncate. Maybe the heavy load. Actually the consolidation tasks handle between 10 – 400 MB data, clone it to “false” temporary table, apply financial rules on it and replace the original data at the end of the calculation. The application provider told us they did choose this way as the application users may still be able to consult the old data during the calculation. Depending to the complexity of the calculations, the tasks may last 1 min to several hours. Most of the time less than 20 min.
the ~200 IO per hard drive really applies to random IO at low queue depth.
a single disk can drives >5K small block sequential IO/sec, so you really don’t need to 12 drives for the log. but i would recommend 4-6 drives to allows for better performance during log backups, which disrupts the normal sequential disk op. also, the IOPs per disk is much more complicated than a simple 240/disk. i will post my slides on this soon after SQL Connections Mar 20-23.
Actually, we did choose 64kB Raid cluster size for all volumes as to achieve maximum backup speed. I played with the SAN “Read-ahead cache” and decreased 20 min to the whole backup process which now lasts 40 min. The backup process was build from a maintenance plan -> check + backup + restore verify only. The backup process needs not to be too long, because the increased load generates application trouble and we need to transfer the SQL backup to another server after. The SAN has got two active/active FC controllers with 1GB cache each. One is dedicated to the SQL Data array which generates most I/Os and the second one to Log + Data + File server (managed by the second cluster node). Today I left the IBM FastT700 Performance Monitor running and got these numbers : ______Total IOs___Read_____CacheHit____Max KB/s____Max IO/sec
Log____3950000______0%________15%_______22800_______1740 Backup -> will get more interesting numbers Monday after nightly full backup (no backup occurred during these logs collection).
even if you don’t control the application, it should still be possible to change the indexes, if you are willing to give this a try. you could even build custom indexes for the consolidation work. SQL Server does not use indexes correctly in large updates, i gave a session at SQL Connections on this last year. i am not sure you actualy have a storage problem depending on how sustained the MAX IO/sec values are.
also note the the MAX KB/sec of 175MB/sec is about as much as can go through 1 2Gbit/sec FC, if you need more, you need a second FC adapter
I’ve just sent some questions to the application provider about Clustered index, Derived Table and Table Variable. At least, I guess CLUSTERED INDEX should be used at the end of the calculation. After this operation, consolidated financial data should only be used as "read only" data. So this kind of index shouldn’t have a too negative effect if created at the end of the task.<br /><br />I made a check on the different "Disk Queue Length" counters of the Data, Log and Backup volumes (24h collection every 5 sec). Log and Backup should be ok as they never rise above 2. Seems that even if the I/Os raise high on these volumes, they can handle it (good use of memory cache probably). The Data volume is more worrying as the "Avg. Disk Read Queue Length" often rises above 200 (one peak above 500). During 24 hour, the longest peak seems to last about 1 min, the others less than 10 sec.<br /><br />I extracted the 239 IOPs from the ST373307FC disk documentation:<a target="_blank" href=http://www.seagate.com/support/disc/manuals/fc/100195490b.pdf>http://www.seagate.com/support/disc/manuals/fc/100195490b.pdf</a><br />As you told this number should be taken at IO low queue, my calculation may be a little wrong [<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />]<br /><br />The FastT700 Performance Monitor ran the entire week-end. The Backup volume numbers seems to be quite good: ~145MB/s with ~2300 I/Os max – 86% Read Cache hit.