Selecting the correct hardware, and preparing it correctly, is half the battle when clustering Windows 2000. As a rule of thumb, if you have done a good job with the hardware, you should not have any difficulty installing Windows 2000 Cluster Services. In fact, it should be a non-event. But if you don’t have the correct hardware installed and configured correctly, you could spend hours, if not days, troubleshooting why you can’t install Cluster Services.
The focus of this section is to highlight some of the most important considerations when selecting and configuring hardware for a SQL Server cluster. Because there are so many possible hardware configurations for Windows 2000 clustering, a single article cannot cover all of the potential issues. But for the most part, the topics discussed in this article will cover 90% or more of the “big” issues when it comes to Windows 2000 clustering hardware.
Hardware Requirements for Clustering
Each node in the cluster must have this required hardware:
- A physical disk to hold Windows 2000 Advanced Server or Windows 2000 Datacenter Server. This disk must not be connected to the storage bus used to connect to the shared array. Ideally, this drive should be connected to its own SCSI card, and should be mirrored for best fault tolerance.
- A SCSI or Fibre Channel adapter to be used to connect to the shared disk array. Other than the shared disk array, no other devices should be attached to this adapter.
- Two PCI network adapter cards: one for the public network and one for the private network.
- RAM requirements will vary, depending on the needs of SQL Server.
- CPU requirements will vary, depending on the needs of SQL Server.
Each of these will be discussed in more detail later in this article.
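The requirements listed above can be turned into a simple pre-installation checklist. The following is only a sketch: the NodeHardware record and its field names are hypothetical and are filled in by hand, not read from any real inventory tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeHardware:
    """Hypothetical, hand-filled inventory record for one cluster node."""
    system_disk_on_shared_bus: bool       # True if the OS disk sits on the shared storage bus
    has_shared_array_adapter: bool        # SCSI or Fibre Channel adapter for the shared array
    other_devices_on_shared_adapter: int  # devices on that adapter besides the shared array
    network_adapters: int                 # PCI network cards installed


def missing_requirements(node: NodeHardware) -> List[str]:
    """Return a list of the requirements from the article that this node fails."""
    problems = []
    if node.system_disk_on_shared_bus:
        problems.append("system disk must not be on the shared storage bus")
    if not node.has_shared_array_adapter:
        problems.append("no SCSI or Fibre Channel adapter for the shared array")
    if node.other_devices_on_shared_adapter > 0:
        problems.append("shared-array adapter has other devices attached")
    if node.network_adapters < 2:
        problems.append("need two network adapters: public and private")
    return problems
```

An empty list means the node passes these basic checks. RAM and CPU are deliberately left out, since they depend on the needs of SQL Server.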
Active/Active vs. Active/Passive Configuration
Before we go any further, you need to know whether you will be running your SQL Server cluster in an Active/Active or an Active/Passive configuration. The answer to this question will affect how you need to size the servers belonging to the cluster. (Note: This article focuses on the two-node cluster design, although most of the information in this article is also applicable to a four-node cluster design.)
An Active/Active configuration means that you will be running SQL Server on both nodes of the cluster. So if one node fails over to the other node, the surviving node will be running two instances of SQL Server, instead of one instance.
An Active/Active configuration has two major implications for your cluster hardware. First, ideally, each node in the cluster must be large enough in capacity to run both instances of SQL Server. While you may not need to size each node to be fully twice as big as necessary in order to handle the double load when a failover occurs, each node does need to be large enough to handle both loads well enough so that user productivity doesn’t suffer unnecessarily when a failover does occur. Second, an Active/Active configuration requires that the shared disk array have available at least two separate disks, one for each of the active nodes, in addition to a shared disk for use as the cluster quorum drive.
In an Active/Passive configuration, SQL Server is only running on a single node of the cluster. So if the primary node (the node that is currently running SQL Server) fails, the other node takes over. Assuming both servers have the same hardware, there should be no performance issues after a failover. Because there is only one instance of SQL Server running, each server only needs to be as big as necessary to handle a single SQL Server’s needs.
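To make the difference between the two configurations concrete, here is a minimal sketch of what ends up running where after a failover in a two-node cluster. The node and instance names are made up for illustration, and the function only models placement; it does not talk to a real cluster.

```python
from typing import Dict, List


def instances_after_failover(placement: Dict[str, List[str]],
                             failed_node: str) -> Dict[str, List[str]]:
    """Model a failover in a two-node cluster.

    placement maps each node name to the SQL Server instances it runs.
    The surviving node keeps its own instances and picks up the failed node's.
    """
    if failed_node not in placement or len(placement) != 2:
        raise ValueError("this sketch assumes a two-node cluster and a known node")
    survivor = next(name for name in placement if name != failed_node)
    return {survivor: placement[survivor] + placement[failed_node]}


# Hypothetical names, for illustration only.
active_active = {"NODE1": ["SQL1"], "NODE2": ["SQL2"]}
active_passive = {"NODE1": ["SQL1"], "NODE2": []}
```

In the Active/Active case, losing NODE1 leaves NODE2 running both instances, which is why each node has to be sized for the combined load. In the Active/Passive case, NODE2 simply takes over the single instance.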
Sizing the Cluster’s Servers
For the most part, you size the servers in a cluster as you would size them for a SQL Server not in a cluster. The number of CPUs, the amount of RAM, and the size and type of the disk arrays you need depend on how big the database is and how active it is. While a cluster does add some overhead not found in a non-clustered SQL Server, this overhead is minimal. Memory is the resource that clustering uses the most of, so when you size RAM for your servers, be sure you include not only enough for SQL Server and its databases, but also some extra for the needs of Windows 2000 and the clustering services. In most cases, allow 128-256MB of RAM above and beyond what SQL Server will need.
As has already been mentioned above, an Active/Active SQL Server configuration requires that each server in the cluster be oversized so that it has enough resources to run two instances of SQL Server simultaneously under production conditions.
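The RAM arithmetic described above can be sketched as follows. The function name and the sample figures are illustrative assumptions; the only numbers taken from this article are the 128-256MB allowance for Windows 2000 and clustering. For Active/Active, the sketch sizes for the worst case by adding both instances together, which is an upper bound rather than a requirement.

```python
from typing import List


def node_ram_mb(instance_needs_mb: List[int], overhead_mb: int = 256) -> int:
    """Estimate RAM for one node, in MB.

    instance_needs_mb: RAM each SQL Server instance that may run on this node
                       needs (one entry for Active/Passive, two for Active/Active).
    overhead_mb:       extra for Windows 2000 and clustering (128-256MB per the article).
    """
    if not 128 <= overhead_mb <= 256:
        raise ValueError("overhead outside the 128-256MB range used in this article")
    return sum(instance_needs_mb) + overhead_mb


# Illustrative figures only: a 2048MB instance and a 1536MB instance.
passive_estimate = node_ram_mb([2048])        # single instance per node
active_estimate = node_ram_mb([2048, 1536])   # both instances after a failover
```

With these sample figures, the Active/Passive node comes out at 2304MB and the Active/Active node at 3840MB.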
If the nodes in your cluster will be using AWE memory (more than 4GB of RAM in each server), it is critical that both nodes have the identical amount of RAM, configured identically. If not, then failover will most likely fail.
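A quick sanity check for this rule might look like the sketch below. It only compares the installed amounts that you type in; it cannot tell whether the memory is configured identically on both nodes, which you still have to verify by hand.

```python
AWE_THRESHOLD_MB = 4 * 1024  # AWE applies above 4GB of RAM, per the article


def awe_ram_mismatch(node_a_mb: int, node_b_mb: int) -> bool:
    """Return True if either node is above 4GB and the two amounts differ."""
    uses_awe = max(node_a_mb, node_b_mb) > AWE_THRESHOLD_MB
    return uses_awe and node_a_mb != node_b_mb
```

A True result means the pair of nodes should be corrected before you rely on failover.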
The Shared Array
The shared array is one of the most critical components of a Windows 2000 cluster. It is the disk storage that will be shared between the two nodes in the cluster. It will hold the important quorum drive (used by the Windows 2000 Clustering Service), along with the one or more drives or drive arrays that will hold the shared databases. The shared array is connected to the two nodes in the cluster via a shared bus, either SCSI or Fibre Channel.
In an Active/Active configuration, you will need not only a quorum drive, but also at least two shared drives. One of these shared drives will be used by the first active node, and the second shared drive will be used by the second active node. Two nodes cannot control a single shared disk array at the same time. And since an Active/Active configuration is running two separate instances of SQL Server, each instance must have exclusive access to its own shared disk or disk array. On the other hand, any single node in a cluster can access more than one disk array at the same time, as long as that node is the exclusive owner of each of those disk arrays.
In an Active/Passive configuration, only a quorum drive and one shared disk array are required as a minimum configuration. If necessary, more than one shared disk array can be owned by the active node of a cluster.
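The disk rules above come down to two checks: do you have enough shared disks for your configuration, and is every disk owned by exactly one node at a time? Here is a minimal sketch of both. The disk and node names are hypothetical, and the ownership map is something you would fill in yourself.

```python
from typing import Dict, Set


def minimum_shared_disks(configuration: str) -> int:
    """Minimum shared disks, counting the quorum drive.

    Active/Active needs two data disks plus quorum; Active/Passive needs one plus quorum.
    """
    data_disks = {"active/active": 2, "active/passive": 1}[configuration.lower()]
    return data_disks + 1  # plus the quorum drive


def ownership_conflicts(ownership: Dict[str, Set[str]]) -> Set[str]:
    """Return the disks that more than one node claims to own."""
    seen: Set[str] = set()
    conflicts: Set[str] = set()
    for disks in ownership.values():
        conflicts |= seen & disks   # already claimed by an earlier node
        seen |= disks
    return conflicts
```

A node may appear with several disks in the ownership map, which matches the rule that one node can own more than one disk array. Any disk returned by ownership_conflicts breaks the exclusive-ownership rule.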
Most shared disk arrays are in the form of a self-contained disk subsystem or a Storage Area Network (SAN), connected to the nodes in the cluster with either SCSI or Fibre Channel connections and cables. Based on my experience, this is the area where most of the hardware configuration problems occur. There are many different types of SCSI and Fibre Channel equipment, and they all must be configured differently. You will want to take special care when following Microsoft’s and the vendor’s instructions on how to configure shared array components for your cluster. This is especially true for SCSI connections, which require that appropriate SCSI IDs be set and that the bus be properly terminated.
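One SCSI mistake that is easy to catch on paper is two devices on the shared bus using the same SCSI ID, including the host adapters in the two nodes. The sketch below checks a hand-written map of device names to IDs for duplicates. The device names and ID values are examples only, and the check says nothing about termination, which still has to be verified against the vendor’s instructions.

```python
from collections import Counter
from typing import Dict, List


def duplicate_scsi_ids(devices: Dict[str, int]) -> List[int]:
    """Return the SCSI IDs assigned to more than one device on the shared bus."""
    counts = Counter(devices.values())
    return sorted(scsi_id for scsi_id, count in counts.items() if count > 1)


# Hypothetical shared bus: both host adapters left at the same ID.
shared_bus = {"node1_adapter": 7, "node2_adapter": 7, "disk0": 0, "disk1": 1}
```

For the sample bus above, the function reports ID 7, because both adapters are set to it. Changing one adapter to an unused ID clears the conflict.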
Most shared disk arrays allow you to configure RAID arrays for the best fault tolerance. For best performance, select a RAID 10 configuration. If this is not available or affordable, then select RAID 5.
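Part of what makes RAID 10 less affordable is how much raw disk it consumes. The sketch below compares usable capacity for the two levels using the standard formulas: RAID 10 mirrors every disk, so half the disks hold copies, while RAID 5 gives up the equivalent of one disk to parity. The disk counts and sizes are illustrative, and real arrays may reserve additional space.

```python
def usable_capacity_gb(raid_level: int, disk_count: int, disk_size_gb: int) -> int:
    """Usable capacity in GB for RAID 10 or RAID 5 with identical disks."""
    if raid_level == 10:
        if disk_count < 4 or disk_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disk_count // 2) * disk_size_gb   # half the disks are mirrors
    if raid_level == 5:
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disk_count - 1) * disk_size_gb    # one disk's worth of parity
    raise ValueError("this sketch only covers RAID 10 and RAID 5")
```

For example, eight 36GB disks yield 144GB usable in RAID 10 and 252GB usable in RAID 5.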