This document outlines the tasks needed to invoke a contingency plan for our production SQL Server. This would be required if the primary server became unavailable. It also covers the tasks needed to restore the service back to the default state of running on a Primary and Contingency server.
Servers involved:
SERVERNAME Primary server
SERVERNAME Contingency server
Store all the SQL scripts involved in the Disaster Recovery test on both Primary and DR SQL Servers.
Process to Fail Over from Primary to Contingency SQL Server
Step |
Task |
Confirm no issues with current SQL Server. |
|
1 DR Process |
LIVE SERVER Keep required databases to DBO Use Only. Stop / Start MSSQLServer & MSSQLAgent services. |
CHECKPOINT – Ensure required service is stopped and Primary server no longer accessible to application. |
|
2 |
LIVE SERVER Ensure to disable required jobs on Primary server and enable on Contingency server. |
3 |
LIVE SERVER Stop / Start MSSQLServer service to clear user connections. |
4 |
DR SERVER Execute RESTORE statement WITH RECOVERY to keep databases online and usable state. Execute any scripts involved for the logins. Perform full backup for all user databases on Primary Server and restore them on contingency SQL server. |
5 |
DR SERVER Execute any scripts involved to drop all the users. Create the users with the scripts involved. Ensure the application and group logins have been allocated to correct roles and databases. Disable DBO Use only for any user databases. |
CHECKPOINT – Ensure databases on Contingency SQL server are accessible by the application. |
|
At this point the SQL Server DR SERVER will be active and can be accessed from the application. Ensure no issues with the application. |
|
CHECKPOINT – Refer SQL server error logs, event viewer logs and any other application logs involved. For any reason if the database is not accessible from application try to fix it, if unable to fix then abandon the whole process of Disaster Recovery test. |
|
LIVE SERVER Ensure there are no users connected to Primary Server by referring to Current activity . |
|
DR SERVER |
|
CHECKPOINT – DR test complete for swapping roles from PRIMARY to CONTINGENCY. |
Process to fail back from Contingency to Primary SQL Server
1 DR Process |
DR SERVER Keep required databases to DBO Use Only. Stop / Start MSSQLServer & MSSQLAgent services. |
CHECKPOINT – Ensure required service is stopped and Contingency server no longer accessible to application. |
|
2 |
DR SERVER Ensure to disable required jobs on Contingency server and enable on Primary server. |
3 |
DR SERVER Stop / Start MSSQLServer service to clear user connections.. |
4 |
LIVE SERVER Execute RESTORE statement WITH RECOVERY to keep databases online and usable state. Execute any scripts involved for the logins. Perform full backup for all user databases on DR Server and restore them on Live SQL server. |
5 |
LIVE SERVER Execute any scripts involved to drop all the users. Create the users with the scripts involved. Ensure the application and group logins have been allocated to correct roles and databases. Disable DBO Use only for any user databases. |
CHECKPOINT – Ensure databases on Live SQL server are accessible by the application. |
|
At this point the SQL Server LIVE SERVER will be back online and can be accessed from the application. Ensure no issues with the application. |
|
CHECKPOINT – Refer SQL server error logs, event viewer logs and any other application logs involved. |
|
DR SERVER Ensure there are no users connected to DR Server by referring to current activity . |
|
LIVE SERVER |
|
CHECKPOINT – DR test complete for swapping roles from CONTINGENCY to PRIMARY. |
|
The process of disaster recovery is complete. Update necessary documentation as necessary. |