Preparing the SQL Server 2005 Clustering Infrastructure
If It’s Not on the Hardware Compatibility List, Don’t Even Think About It
Whether you select your own hardware or get recommendations from a vendor, it is critical that the hardware you select be listed in the Cluster Solutions section of the Microsoft Hardware Compatibility List (HCL), which can be found at http://www.windowsservercatalog.com/.
As you probably already know, the HCL lists all of the hardware that Microsoft has certified to run its products. If you are not building a cluster, you can pick and choose almost any combination of certified hardware from multiple vendors and know that it will work with Windows Server 2003.
This is not the case with clustering. If you look at the Cluster Solutions section of the HCL, you will notice that entire systems, not individual components, have to be certified. In other words, you can’t just pick and choose individually certified components and know that they will work together. Instead, you must select from approved cluster systems, which include the nodes and the shared array. In some ways, this reduces the variety of hardware you can choose from. On the other hand, by selecting only approved cluster systems, you can be assured the hardware will work in your cluster. And if you need another reason to select only an approved cluster system, Microsoft will not support a cluster that does not run on one.
In most cases, you will find your preferred hardware listed as an approved system. But, as you can imagine, the HCL is always a little behind, and newly released systems may not be on the list yet. So what do you do if the system you want is not currently on the HCL? Do you select an older, but tested and approved, system, or do you take a risk and purchase a system that has not yet been tested and officially approved? This is a tough call. What I have done in the past when confronted by this situation is to require the vendor to commit, in writing, that the hardware will be certified by Microsoft at some time in the future, and that if the hardware is not approved as promised, the vendor will correct the problem by replacing the unapproved hardware with approved hardware at its own cost. I have done this several times and it has worked out fine so far.
Preparing the Hardware
As a DBA, you may or may not be the one who installs the hardware. In any case, here are the general steps most people follow when building cluster hardware:
- Install and configure the hardware for each node in the cluster as if they will be running as stand-alone servers. This includes installing the latest approved drivers.
- Once the hardware is installed, install the operating system and latest service pack, along with any additional required drivers.
- Connect the node to the public network. To make things easy, name the network used for public connections as “network.”
- Install the private heartbeat network. To make things easy, name the private heartbeat network “private.”
- Install and configure the shared array or SAN.
- Install and configure the SCSI or Fibre Channel cards in each of the nodes and install the latest drivers.
Now, one at a time, connect each node to the shared array or SAN following the instructions for your specific hardware. It is critical that you do this one node at a time. By this, I mean that only one node at a time should be powered on, physically connected to the shared array or SAN, and configured. Once that node is configured, turn it off, and then turn the next node on and configure it, and so on, one node at a time. If you do not follow this procedure, you risk corrupting the disk configuration on your nodes, requiring you to start over again.
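The one-node-at-a-time rule can be written down as an ordered plan and checked before anyone touches the hardware. The following is only a minimal sketch in Python: the function names, the step labels, and the node names NODE1 and NODE2 are hypothetical and are not part of any Microsoft tool. It builds the power-on, configure, power-off sequence described above and confirms that no more than one node is ever online at the same time.

```python
def connection_sequence(nodes):
    """Build the ordered steps for attaching nodes to shared storage.

    Each node is powered on, configured against the shared array or SAN,
    and powered off again before the next node is touched.
    """
    steps = []
    for node in nodes:
        steps.append(("power_on", node))
        steps.append(("configure_storage", node))
        steps.append(("power_off", node))
    return steps


def peak_nodes_online(steps):
    """Return the largest number of nodes powered on at any point in the plan."""
    online = set()
    peak = 0
    for action, node in steps:
        if action == "power_on":
            online.add(node)
        elif action == "power_off":
            online.discard(node)
        peak = max(peak, len(online))
    return peak


if __name__ == "__main__":
    # Hypothetical node names, used only for illustration.
    plan = connection_sequence(["NODE1", "NODE2"])
    for action, node in plan:
        print(action, node)
    print("Peak nodes online:", peak_nodes_online(plan))
```

A plan that powers on a second node before the first is shut down would produce a peak greater than one, which is exactly the situation the procedure above is meant to avoid.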
After connecting each node to the shared array or SAN, you will need to use Disk Administrator to configure and format the drives on the shared array. You will need at least two logical drives on the shared array. One will be for storing your SQL Server databases, and the other will be for the Quorum drive. The data drive must be big enough to store all the required data, and the Quorum drive must be at least 500 MB (which is the smallest size at which an NTFS volume can operate efficiently). When configuring the shared drives using Disk Administrator, each node of the cluster must use the same drive letter when referring to the drives on the shared array or SAN. For example, you might want to assign your data drive as drive “F:” on all the nodes, and assign the Quorum drive letter “Q:” on all the nodes.
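The two rules above (a Quorum drive of at least 500 MB, and identical drive letters on every node) are easy to verify on paper before you configure anything. Here is a small Python sketch of such a check. It is an illustration only: the function name, the dictionary layout, and the node names are assumptions made for this example, and the script does not query Windows or the shared array in any way. It simply validates a plan you have typed in.

```python
MIN_QUORUM_MB = 500  # minimum Quorum drive size recommended in the text


def check_shared_drive_plan(node_drive_letters, quorum_size_mb):
    """Return a list of problems found in a planned shared-drive layout.

    node_drive_letters maps each node name to a dict of role -> drive letter,
    for example {"data": "F:", "quorum": "Q:"}.
    """
    problems = []

    if quorum_size_mb < MIN_QUORUM_MB:
        problems.append(
            f"Quorum drive is {quorum_size_mb} MB; at least {MIN_QUORUM_MB} MB is needed"
        )

    nodes = sorted(node_drive_letters)
    if not nodes:
        problems.append("No nodes defined")
        return problems

    # Use the first node as the reference layout for all the others.
    reference_node = nodes[0]
    reference = node_drive_letters[reference_node]

    # At least a data drive and a Quorum drive are required.
    for role in ("data", "quorum"):
        if role not in reference:
            problems.append(f"{reference_node} has no {role} drive assigned")

    # Every node must refer to the shared drives by the same letters.
    for node in nodes[1:]:
        if node_drive_letters[node] != reference:
            problems.append(f"{node} drive letters differ from {reference_node}")

    return problems


if __name__ == "__main__":
    # Hypothetical two-node plan using the letters from the example above.
    plan = {
        "NODE1": {"data": "F:", "quorum": "Q:"},
        "NODE2": {"data": "F:", "quorum": "Q:"},
    }
    print(check_shared_drive_plan(plan, quorum_size_mb=500) or "Plan looks consistent")
```

An empty result means the plan is consistent with the two rules. It says nothing about whether the data drive is large enough, which still has to be sized against your own databases.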
Once all of the hardware and software is configured, it is critical that it be functioning properly. This means that you need to test the hardware, test it again, and then test it once more, ensuring that there are no problems before you begin installing clustering services. While you may be able to do some diagnostic hardware testing before you install the operating system, you will have to wait until the operating system is installed before you can fully test the hardware.
Once all of the hardware has been configured and tested, you are ready to install Windows Server 2003 clustering. I will cover this topic in a future article.