Clustering slowdown, can you help me? | SQL Server Performance Forums

Ugh, where to start. I am asking you all for help with what has become a cluster nightmare. Thanks to a penny-pinching boss, my setup is not at the top of the Microsoft HCL heap. It was the best I could do with my budget, and I feel it is a good, robust system, except that it doesn't work so well :D. Here it is.

I have an Infortrend IFT-5251FS6S; I am sure you will all recognize Infortrend as the makers of the Sun 3510. This is an external Fibre Channel redundant RAID controller (6 controllers) connected to an Infortrend 16-drive JBOD with 74 GB Seagate Fibre Channel drives. This system works great. In house I connected it to a couple of Dell SC400s (cockroaches) with LSI7102-XP-LC Fibre Channel cards.

When we cluster the boxes with Windows 2003 Enterprise Edition, we test failover and moving groups, along with a few SQL Server test/stress scripts that I wrote for our environment. Failover is fast and efficient, typically completing within 2 to 3 seconds, and failback happens almost immediately once the box becomes available. Disk I/O tests with Microsoft's SQLIOSTRESS.EXE and SQLIO.EXE benchmark the system as incredibly fast.

Now the kicker: when we move our RAID controller, JBOD, and LSI cards into some white boxes that our ISP is leasing us, disk I/O slows down hugely. After OS clustering, a 200 MB file took 21 minutes to copy from one drive to the next. The white boxes are dual Xeon 3.2s on a Tyan 5350 motherboard with 8 GB of RAM, with the OS installed on RAID 1 SCSI drives using the onboard RAID.

When failover is attempted on these boxes, it can sometimes take several minutes to complete. A call to Microsoft and $260 later, they tell us that this is acceptable performance. Really, would any of you accept a 2 minute downtime for failover? Nuh-uh, not me! I have tried everything possible, with clean installs of the OS and SQL Server 2000 Enterprise Edition.

Does anyone have any recommendations at all? I think there is some kind of hardware issue here with the motherboard, but I am not sure. Any advice would be helpful at this point.

Thanks
Chad
Why not dig in from the hardware side first? Verify the disks and their controllers, and check for a faulty NIC or hub on the network that might be contributing to the slow failover. Check the PERFMON counters for CPU, memory, physical disk, process, SQL Server memory, etc. to assess the issue further. There are no specific requirements for how much processor power SQL Server 2000 needs, since that depends on how your application uses SQL Server, but each cluster node should be configured with enough processors of sufficient power to handle the load of any instance that may run on the node. I presume SQL Server has the latest service packs applied and is at the same level on both nodes. One more note:
Make all transactions as small as possible, and commit in logical units of work. On failover the virtual server goes through the startup process, which includes going through the transaction log of each database and rolling transactions forward or back, so larger transactions and a larger volume of transactions can result in a slower failover.
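As a rough illustration of the "small transactions" advice, here is a minimal Python sketch that loads rows in batches and commits after each batch instead of holding one huge transaction open. It uses the standard-library sqlite3 module purely as a stand-in for SQL Server so the example runs on its own; the table name `events`, the helpers `chunked` and `insert_in_batches`, and the batch size of 1000 are all hypothetical and not taken from the setup above.

```python
import sqlite3
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, str]


def chunked(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly short, batch
        yield batch


def insert_in_batches(conn: sqlite3.Connection,
                      rows: Iterable[Row],
                      batch_size: int = 1000) -> int:
    """Insert rows, committing once per batch. Returns the number of commits."""
    commits = 0
    for batch in chunked(rows, batch_size):
        # Each batch is its own short transaction.
        conn.executemany(
            "INSERT INTO events (id, payload) VALUES (?, ?)", batch
        )
        conn.commit()
        commits += 1
    return commits


def demo() -> Tuple[int, int]:
    """Load 2500 rows in batches of 1000; return (commits, rows stored)."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in database
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.commit()
    rows = ((i, f"row {i}") for i in range(2500))
    commits = insert_in_batches(conn, rows, batch_size=1000)
    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return commits, count


if __name__ == "__main__":
    print(demo())
```

The same pattern should carry over to a SQL Server load job: with each batch committed separately, there is less uncommitted work for recovery to roll back when the instance starts on the other node. Pick the batch boundaries so that each commit is still a logical unit of work for your application.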
Satya SKJ
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.