Sample SQL Server Disaster Recover Plans

This document outlines the tasks needed to invoke a contingency plan for our production SQL Server. This would be required if the primary server became unavailable. It also covers the tasks needed to restore the service back to the default state of running on a Primary and Contingency server.

Servers involved:

SERVERNAME         Primary server

SERVERNAME         Contingency server

Store all the SQL scripts involved in the Disaster Recovery test on both Primary and DR SQL Servers.

Process to Fail Over from Primary to Contingency SQL Server

Step

Task

Confirm no issues with current SQL Server.

1 DR Process

LIVE SERVER

Keep required databases to DBO Use Only.

Stop / Start MSSQLServer & MSSQLAgent services.

CHECKPOINT – Ensure required service is stopped and Primary server no longer accessible to application.

2

LIVE SERVER

Ensure to disable required jobs on Primary server and enable on Contingency server.

3

LIVE SERVER

Stop / Start MSSQLServer service to clear user connections.

4

DR SERVER

Execute RESTORE statement WITH RECOVERY to keep databases online and usable state. Execute any scripts involved for the logins.

Perform full backup for all user databases on Primary Server and restore them on contingency SQL server.

5

DR SERVER

Execute any scripts involved to drop all the users.

Create the users with the scripts involved.

Ensure the application and group logins have been allocated to correct roles and databases.

Disable DBO Use only for any user databases.

CHECKPOINT – Ensure  databases on Contingency SQL server are accessible by the application.

At this point the SQL Server DR SERVER will be active and can be accessed from the application. Ensure no issues with the application.

CHECKPOINT – Refer SQL server error logs, event viewer logs and any other application logs involved. For any reason if the database is not accessible from application try to fix it, if unable to fix then abandon the whole process of Disaster Recovery test.

LIVE SERVER

Ensure there are no users connected to Primary Server by referring to Current activity .

DR SERVER
Check SQL / Event logs to ensure no application connectivity issues.

CHECKPOINT – DR test complete for swapping roles from PRIMARY to CONTINGENCY.

Process to fail back from Contingency to Primary SQL Server

1 DR Process

DR SERVER

Keep required databases to DBO Use Only.

Stop / Start MSSQLServer & MSSQLAgent services.

CHECKPOINT – Ensure required service is stopped and Contingency server no longer accessible to application.

2

DR SERVER

Ensure to disable required jobs on Contingency server and enable on Primary server.

3

DR SERVER

Stop / Start MSSQLServer service to clear user connections..

4

LIVE SERVER

Execute RESTORE statement WITH RECOVERY to keep databases online and usable state. Execute any scripts involved for the logins.

Perform full backup for all user databases on DR Server and restore them on Live SQL server.

5

LIVE SERVER

Execute any scripts involved to drop all the users.

Create the users with the scripts involved.

Ensure the application and group logins have been allocated to correct roles and databases.

Disable DBO Use only for any user databases.

CHECKPOINT – Ensure databases on Live SQL server are accessible by the application.

At this point the SQL Server LIVE SERVER will be back online and can be accessed from the application. Ensure no issues with the application.

CHECKPOINT – Refer SQL server error logs, event viewer logs and any other application logs involved.

DR SERVER

Ensure there are no users connected to DR Server by referring to current activity .

LIVE SERVER
Check SQL / Event logs to ensure no application connectivity issues.

CHECKPOINT – DR test complete for swapping roles from CONTINGENCY to PRIMARY.

The process of disaster recovery is complete. Update necessary documentation as necessary.

Continues…

Leave a comment

Your email address will not be published.