sql server 2000 cluster failover attempt failed. | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

sql server 2000 cluster failover attempt failed.

I have an active/passive two-node cluster running SQL Server 2000 on Windows 2003 Standard Edition with a default instance. Everything was working perfectly fine; I tested failover from both nodes and all seemed fine.

Now I installed SP4 from node1, and after the successful installation I rebooted node2 first and tried to move to node2 before rebooting node1, but the failover attempt failed. So I rebooted node1; now node1 is active and working fine. Then I rebooted node2 again and tried to move to node2, and the attempt still failed. (Basically, it's now not working on node2.)

In the Event Viewer I see the following error messages at the time of the failover attempt:

Error 9004, severity 21, state 10 (event id 17055)
[sqsrvres] StartResourceService: Failed to start MSSQL$SERVER service. CurrentState: 1
[sqsrvres] OnlineThread: ResUtilsStartResourceService failed (status 435)
[sqsrvres] OnlineThread: Error 435 bringing resource online.

The three [sqsrvres] errors have an event ID of 17052. I have done a complete search but could not find any solution. I have checked the version of sqlservr.exe on both nodes; it's the same, and it's not “read-only”. Please guide me with your findings. This is a bit urgent. Thanks,
Sunil
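
The post above mentions checking that sqlservr.exe is the same on both nodes. One way to make that check stricter than a version-number comparison is to hash each node's copy of the file and compare the digests. The following is a minimal Python sketch under stated assumptions: the function names `file_sha256` and `compare_binaries` are hypothetical, and the paths passed in are placeholders for wherever each node's copy of the binary is reachable. None of this comes from the original thread.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large binaries are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_binaries(path_a, path_b):
    """Compare two copies of a file.

    Returns a tuple (identical, hash_a, hash_b). The paths are supplied by
    the caller, for example one copy of the binary per cluster node.
    """
    hash_a = file_sha256(path_a)
    hash_b = file_sha256(path_b)
    return hash_a == hash_b, hash_a, hash_b
```

Identical digests only show that the two files match byte for byte. They say nothing about whether the service pack was applied consistently to everything else on each node.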

I am not directly familiar with this error message. Have you tried the following yet? Turn off both nodes of the cluster, then turn on node A and see if it comes up, and if it does, then bring node B up and see if it comes up. If this does not work, then check the logs once again for any clues. If you can’t figure anything else out, then consider removing SQL Server from both nodes and reinstalling it. In my experience, if clustering is not working, it is hard to fix, and often it is much easier to start over. —————————–
Brad M. McGehee, SQL Server MVP

The 9004 error indicates an error when trying to process (recover) a log. Is all your hardware intact? Do both nodes see the drives in the same order/mapping? You might end up getting out your system database backups and restoring them… SQL Server should have an entry in the Event Viewer (or the SQL Server errorlog file) indicating which log it could not recover, and you can work from there as a starting point.
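
To act on the suggestion above and find which log could not be recovered, one could scan the SQL Server errorlog for the 9004 entry and look at the lines around it. A minimal Python sketch follows. The function names, the regular expression, and the assumed line shape (`Error: 9004, Severity: 21, State: 10`) are assumptions for illustration and are not taken from the thread; the log path and its encoding must be supplied by the reader.

```python
import re

# Assumed shape of an error entry, e.g. "Error: 9004, Severity: 21, State: 10".
# The colons are optional so "Error 9004, severity 21, state 10" also matches.
ERROR_PATTERN = re.compile(
    r"Error:?\s*(\d+),\s*Severity:?\s*(\d+),\s*State:?\s*(\d+)",
    re.IGNORECASE,
)


def find_error_context(lines, error_number, context=3):
    """Return (line_index, surrounding_lines) for each matching error entry."""
    lines = list(lines)
    hits = []
    for index, line in enumerate(lines):
        match = ERROR_PATTERN.search(line)
        if match and int(match.group(1)) == error_number:
            # Keep a few lines before and after the entry; nearby lines
            # often name the database or file involved.
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            hits.append((index, lines[start:end]))
    return hits


def scan_errorlog(path, error_number=9004, context=3, encoding="utf-8"):
    """Read a log file and return the context around each matching entry."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        return find_error_context(handle.read().splitlines(), error_number, context)
```

Calling `scan_errorlog` with the path to the errorlog returns one entry per occurrence of the error, each with its neighbouring lines, which is where the name of the affected database or log file would be expected to appear.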

Hi Brad M. McGehee, I haven't tried turning off both nodes at the same time, but I tried restarting both nodes one after the other. On node1 everything is working perfectly fine. When I try to move the cluster group to node2, it works, but when I try to move the SQL group, the SQL Server service fails and the group moves back to node1. This is the present scenario.
When SQL Server is running on node1, which is the active node, I can connect to node2 and access SQL Server, and it works fine. Only the failover to node2 is not working, and it gives the above error.

Is this a new cluster build, or has this cluster been in production for a while and you have just added the Service Pack? If this is a new cluster build, I would suggest you take Haywood’s advice and check to ensure that your hardware is correctly configured and working. Also, if you haven’t tried turning both servers off and then back on, you need to try it. —————————–
Brad M. McGehee, SQL Server MVP