How SQL Cluster Resource DLLs work with Cluster Components to Improve the Failover Process

LooksAlive
As shown in above figure, LooksAlive call is executed from the SQAGTRES.DLL which in turn only checks the SQL Services such as SQL Server Agent. It does the following to check the SQL Agent Server resource:

a. Make a call to Service Control Manager (SCM) to check the registry entries of SQL Server Agent and also make sure that service is up and running. SCM uses its own internal mechanism to check the status of the server. The status returned by the SCM to SQAGTRES.DLL may be either FALSE or TRUE. SQAGTRES.DLL decides the next action on basis of the status returned by the SCM.
b. After SCM, the SQL Server Agent resource is checked in the cluster database with the help of Resource Monitor.

LooksAlive runs under the security context of Cluster Service Account to check the status of SQL Server Agent resource. LooksAlive call doesn’t guarantee that the SQL Virtual Server, system database and SQL instance are operational. This is taken care by the IsAlive call which is discussed in the next section.

IsAlive
The IsAlive is more detailed check against all the SQL resources. SQL’s IsAlive in SQSRVRES.DLL is implemented in such a way that it performs all the checks for SQL Resources. It checks to make sure all the: 

  • SQL resources are online.
  • SQL resources are configured with correct dependencies.
  • Dependent SQL Resources are online.
  • The registry entries for SQL resources are configured correctly.
  • SQL Virtual Server Instance is operating normally.
  • SQL System Databases are functional by performing a SQL Transact query.  

 

 

Note: The IsAlive call doesn’t perform any check against the user databases.
 
Please note that Resource Monitor executes IsAlive and LooksAlive queries against the whole Cluster Group. It is the responsibility of the Resource DLL (SQSRVRES.DLL) to execute its own IsAlive and LooksAlive against its resources.

When you setup SQL cluster for the first time, the Cluster Service running on the node takes a snapshot of the cluster configuration and saves in HKLMCluster key as shown in Figure 1.1. This Key contains the cluster configuration such as resources name, their GUID, node holding the resources and status. This is generally called cluster configuration database. As an example, for SQL it includes the following resources:

Resource Name GUID Node Name Status Flags
 SQL Server Network Name  {GUID1}  Node1  Online   1
 SQL Server IP Address  {GUID2}  Node1  Online  1
 SQL Server  {GUID3}  Node1  Online  1
 SQL Server Agent  {GUID4}  Node1  Online  1
 SQL Server Full-Text  {GUID5}  Node1   Online  0
 MSTDC  {GUID6}  Node1  Online  0

Before Resource Monitor executes any cluster function against the SQL Cluster Group, it decides to look at the cluster configuration database to check the status of all the resources and their GUIDs. For example, there is a cluster group by name “SQLVS”. All the SQL resources reside in this group. When IsAlive interval expires, the Resource Monitor executes the IsAlive call against the “SQLVS” Cluster Group. It passes the Resource GUID and the Status of SQL resources to SQL Resource DLL (SQSRVRES.DLL). SQSRVRES.DLL in turn executes IsAlive call to check the resources availability. Please note that SQSRVRES.DLL doesn’t really know about the status of SQL Resources. It is the Resource Monitor who supplies this information to SQSRVRES.DLL at the time of executing IsAlive queries.

Next functions are Open, Close, Online and Offline. These functions are called whenever the SQL Resources are moved or taken offline/online or when there is a need to call them. For example, you might want to take SQL Resources offline for maintenance purpose on a node. In that case, Resource Monitor executes Offline function and in turn SQSRVRES.DLL executes the Offline function to take the resources offline. We will discuss these functions later in this article with an example. As a whole, these functions are executed by the Cluster Service and supported by SQL Resource DLL. That’s why SQL 2000 and later versions are known as a pure cluster-aware application!

Resource Monitors determine the state of resources by checking the flag value in the registry. This value could be either 1 or 0. 1 is for Online and 0 is for Offline. If you stop an SQL Service on a cluster node from the Cluster Administrator, the value 0 is set for that service or resource in the registry. If you stop the service using command line or any other tool, the value is not set. It is left intact because this operation occurred out of the cluster operation. Any operation occurred out of the cluster doesn’t reflect any changes to the cluster configuration database. In this case, the IsAlive query may not function correctly because in next interval, IsAlive will not take any action on stopped service because the value supplied by the Resource Monitor indicates that the SQL Service or resource is already online. Please note these values are not maintained by the SQL Resource DLL rather maintained by the Resource Monitor and supplied at the time of performing IsAlive queries.

Continues…

Leave a comment

Your email address will not be published.