SAN vs. Local Storage
If you check out the SQL Server newsgroups, you often see questions about whether a SAN or local storage is the best option. Just like more answers, it depends. A properly configured SAN can be faster than locally-attached storage, and offer other benefits, such as better redundancy, more flexible storage options, virtualization, and more. On the other hand, a poorly configured SAN can result is very poor IO performance as compared to a properly configured locally-attached storage.
Then there is the factor of cost. Assuming cost is a priority, then a SAN is probably out of the question. But if performance and storage redundancy and flexibility is more important than cost, then consider a SAN. The key to great SAN performance is a correct configuration, which often requires a SAN expert to implement.
Total Amount of Available Drive Space on Server
While the performance effect isn’t huge, it is important that all of your disk arrays have at least 20% of free space. This is because NTFS (which is the disk format I assume you are using) needs extra space to work efficiently. If the space is not available, then NTFS is not able to function at its full capacity and I/O performance can degrade. It also leads to more disk fragmentation, which causes the server to work harder to read and write data.
Take a look at each of the physical disks in your SQL Server, checking to see if there is at least 20% or more of free space. If there isn’t, then consider trying:
• Removing any unnecessary data from the disks (empty the recycle bin, remove temp files, remove setup files, etc.)
• Moving some of the data to disks with more space
• Adding more disk space
Total Number of Physical Drives in Each Array
A disk array generally refers to two or more physical disk drives working together as a single unit. For example, a RAID 5 array might have four physical drives in it. So why is it important to know how many physical drives are in the one or more arrays in your SQL Server?
With the exception of mirrored arrays (which are two physical drives working together), the more physical drives that are in an array, the faster reads and writes are for that array.
For example, let’s say that I want to purchase a new SQL Server with a RAID 5 array and that I need at least 800 GB of available space. Let’s also assume that the vendor has proposed two different array configurations:
5 – 200 GB drives (800 GB available)
9 – 100 GB drives (800 GB available)
Both of these options meet our criteria of providing at least 800 GB of RAID 5 disk space. But which array will provide better read and write performance? The answer is the second choice, the nine 100 GB drives. Why?
Generally speaking, the more disks that are in an array, the more disk heads there are available to read and write data. SCSI and fiber-channel drives, for example, have the ability to read and write data simultaneously. So the more physical drives that there are in an array, the faster data is read or written to the array. Each drive in the array shares part of the workload, and the more, the better. There are some limits to this, depending on the disk controller, but generally, more is better.
So what does this mean to you? After you take a look at the number of arrays you have in your SQL Server, and the number of drives in each array, is it feasible to reconfigure your current arrays to take better advantage of the principal of more is better?
For example, let’s say that your current server has two disk arrays used to store user databases. Each is a RAID 5 array with three 100 GB drives each. In this case, it might be beneficial to reconfigure these two arrays into a single array of six 100 GB drives. Not only would this provide faster I/O, but it would also recover 200 GB of hard disk space.
Take a careful look at your current configuration. You may, or may not be able to do much. But if you can, you will be able to see the benefits of your change as soon as you make them.
RAID Level of Array Used for SQL Server Databases
As you probably already know, there are various different types of disk array configurations, called RAID levels. Each has their pros and cons. Here is a brief summary of the most commonly used RAID levels, and how they can be best used in your SQL Server:
The operating system and SQL Server executables, including the operating system’s swap file, are often located on a RAID 1 array. Some people locate the swap file on its own RAID 1 array, but I doubt that this really offers much of a performance boost because paging, on a well-configured server, is not much of an issue. This option provides a good compromise between performance, fault tolerance, and cost.
If your SQL Server database(s) are very small, and all the databases can fit on a single disk drive, consider RAID 1 for the storing of all your SQL Server data files, especially if cost is an issue.
Another way to use RAID 1 arrays is to separate each transaction log on its own RAID 1 array. This is because transactions logs are written to and read from sequentially, and by isolating them to their own array, sequential disk I/O won’t be mixed with slower random disk I/O, and performance is boosted. If you can’t put each transaction log on its own RAID 1 array, you also have the option of putting all of your transaction logs on a single RAID 1 array. This is not an ideal situation, but is better than mixing your database and log files on the same array.
Although this is the most popular type of RAID storage, it is not the best option for optimum SQL Server I/O performance. If a database experiences more than 10% writes, and most OLTP databases do, write performance will suffer, hurting the overall I/O performance of SQL Server. RAID 5 is best used for read-only or mostly read-only databases. Testing at Microsoft has found that RAID 5 can be as much as 50% slower than using RAID 10. On the other hand, RAID 10 is expensive, and your budget many only support a RAID 5 array option. RAID 5 is also a poor choice for storing transaction logs, as transactions logs are very write intensive.
RAID 10 offers the best performance for SQL Server databases (read and write), although it is the most expensive RAID option. The more write intensive the database, the more important it is to use RAID 10. RAID 10 arrays are also a good option for transaction logs.
Most likely, your current SQL Server configuration does not match the recommendations above. In some cases, you may be able to modify your current array configuration to come closer to what is recommended above, but in most cases, you will probably have to live with what you have until you get a new budget for a new server and array(s).
If you can only do one of the above recommendations, I would recommend that you move to RAID 10 over the other options. This option, above all others listed above, will give you the greatest overall boost in SQL Server I/O performance.
Hardware vs. Software RAID
RAID can be implemented through hardware or software (via the operating system). There is no debate on this topic, don’t ever user software RAID, it is very slow. Always use hardware RAID.
Disk Fragmentation Level
If you create a new database on a brand new disk array, the database file and transaction log file created will be one contiguous file. But if your database or transaction log grows in size (and what database and transaction log doesn’t) over time, it is possible for the files to become physically fragmented over time. File fragmentation, which scatters pieces of your files all over a disk array, causes your disk array to work harder to read or write data, hurting disk I/O performance.
As part of your performance audit, you need to find out how defragmented your SQL Server database and transaction logs are. If you have Windows 2000 or 2003, you can use the built-in defragmentation utility to run a fragmentation analysis to see how badly the files are fragmented. Or, you can use a third-party defragmentation tool.
If the analysis recommends that you defragment, you should. Unfortunately, defragmenting a SQL Server’s database and transaction log files is not always an easy task. Open files, such as those database and transaction log files found on a running SQL Server, cannot always be defragmented. For example, the built-in defragmentation utility cannot defrag SQL Server MDF and LDF files, but Diskeeper can in many cases, but not all. This means, than is some cases, you may have to bring SQL Server offline in order to defrag MDF and LDF files. And depending on how fragmented the files are, and the size of the files, this could take many hours.
But do you really have much choice about defragmenting your SQL Server files? If your I/O performance is currently adequate, then you shouldn’t bother defragmenting. But if your I/O performance is a bottleneck, then defragmenting is one inexpensive way of boosting performance, albeit a time consuming one in many cases.
One note about SANS. Some SANs include their own built-in defragmentation software. If this is the case for your SAN, then you won’t have to worry about defragmenting the data on your SAN.
Ideally, you should periodically physically defragment your SQL Server database and transaction log files. This way, you can ensure that you don’t experience any I/O performance issues because of this very common problem.
Location of the Operating System
For best performance, operating system files should be on a disk array that does not include the SQL Server data files (MDBs or LDFs). In addition, they should be located on a disk array that supports either RAID 1, 5, or 10.
Generally, I install, as most people do, the operating system on drive C: of the server. I usually configure drive C: as a RAID 1 mirrored drive for both fault tolerance and best overall performance.
In most cases, as long as you don’t locate the operating system on the same array as SQL Server data files, you have great flexibility in placing operating system files on your server.
Location of Swap File
Assuming that your SQL Server is a dedicated SQL Server, and that SQL Server memory usage has been set to dynamic (the default), the swap file won’t see a lot of activity. This is because SQL Server doesn’t normally use it a lot. Because of this, it is not critical that the swap file be located in any particular location, except you don’t want to locate it on the same array as SQL Server data files.
Generally, I place the swap file on the same array as the operating system and SQL Server executables, which I have indicated earlier, is a disk array that supports RAID 1, RAID 5, or RAID 10. This is usually drive C:. This makes administration much easier.
If your SQL Server is a shared server, running applications other than SQL Server, and paging is an issue (due to the other applications), you might want to consider moving the swap file to its own dedicated array for better performance. But better yet, make SQL Server a dedicated server.
Location of SQL Server Executables
The location of the SQL Server executables (binaries), like the location of the operating system files, is not critical, as long as they are not located on the same array as the SQL Server data files. As with operating system files, I generally place SQL Server executables on drive C:, which is generally configured as a RAID 1 mirrored drive.