Disk Fragmentation Level
If you create a new database on a brand new disk array, the database file and transaction log file created will be one contiguous file. But if your database or transaction log grows in size (and what database and transaction log doesn’t), it is possible for the files to become fragmented over time. File fragmentation, which scatters pieces of your files all over a disk array, causes your disk array to work harder to read or write data, hurting disk I/O performance.
As part of your performance audit, you need to find out how defragmented your SQL Server database and transaction logs are. If you have Windows 2000 or 2003, you can use the built-in defragmentation utility to run a fragmentation analysis to see how badly the files are fragmented. If you are running Windows NT Server 4.0, then you will have to use a third party utility, such as Diskeeper from Executive Software, to perform the analysis.
If the analysis recommends that you defragment, you should. Unfortunately, defragmenting a SQL Server’s database and transaction log files is not always an easy task. Open files, such as those database and transaction log files found on a running SQL Server, cannot always be defragmented. For example, the built-in defragmentation utility cannot defrag SQL Server MDF and LDF files, but Diskeeper 8.0 can in many cases, but not all. This means, than is some cases, you may have to bring SQL Server offline in order to defrag MDF and LDF files. And depending on how fragmented the files are, and the size of the files, this could take many hours.
But do you really have much choice about defragmenting your SQL Server files? If your I/O performance is currently adequate, then you shouldn’t bother defragmenting. But if your I/O performance is a bottleneck, then defragmenting is one inexpensive way of boosting performance, albeit a time consuming one in many cases.
Ideally, you should periodically defragment your SQL Server database and transaction log files. This way, you can ensure that you don’t experience any I/O performance issues because of this very common problem.
Location of the Operating System
For best performance, operating system files should be on a disk array that does not include the SQL Server data files (MDBs or LDFs). In addition, they should be located on a disk array that supports either RAID 1, 5, or 10.
Generally, I install, as most people do, the operating system on drive C: of the server. I usually configure drive C: as a RAID 1 mirrored drive for both fault tolerance and best overall performance.
In most cases, as long as you don’t locate the operating system on the same array as SQL Server data files, you have great flexibility in placing operating system files on your server.
Location of SQL Server Executables
The location of the SQL Server executables (binaries), like the location of the operating system files, are not critical, as long as they are not located on the same array as the SQL Server data files. As with operating system files, I generally place SQL Server executables on drive C:, which is generally configured as a RAID 1 mirrored drive.
If you are building a SQL Server 7.0 cluster, then the SQL Server executables cannot be located on drive C:, but instead must be located on a shared array. Unfortunately, this is often the same array that you store the SQL Server data files, unless you have a lot of money to spend on a separate array just for the executables. While performance is somewhat hindered by locating the executables on the same shared array as the data files, it is not too bad a compromise, given the fault tolerance you are getting in return. On the other hand, this is a good reason to upgrade to SQL Server 2000 clustering. If you are building a SQL Server 2000 cluster, then the SQL Server executables have to be located on local drives, not the shared array, so performance is not an issue.
Location of Swap File
Assuming that your SQL Server is a dedicated SQL Server, and that SQL Server memory usage has been set to dynamic (the default), the swap file won’t see a lot of activity. This is because SQL Server doesn’t normally use it a lot. Because of this, it is not critical that the swap file be located in any particular location, except you don’t want to locate it on the same array as SQL Server data files.
Generally, I place the swap file on the same array as the operating system and SQL Server executables, which I have indicated earlier, is a disk array that supports RAID 1, RAID 5, or RAID 10. This is usually drive C:. This makes administration much easier.
If your SQL Server is a shared server, running applications other than SQL Server, and paging is an issue (due to the other applications), you might want to consider moving the swap file to its own dedicated array for better performance. But better yet, make SQL Server a dedicated server.
Location of the tempdb Database
If your tempdb database is heavily used, consider moving it to an array of its own, either RAID 1 or RAID 10, to boost disk I/O performance. Avoid RAID 5 arrays as they can be slow when writing data, a common side-effect of using tempdb. If you can’t locate the tempdb on its own array, and you want to avoid locating it on the same array as your database files, consider locating it on the same drive as the operating system. This will help to reduce overall I/O contention and boost performance.
If your application uses the tempdb database a lot, and causes it to grow larger than its default size, you may want to permanently increase the default size of the tempdb file to a size closer to what is actually used by your application on a day-to-day basis. This is because every time the SQL Server service (mssqlserver) is restarted, the tempdb file is recreated to the default size. While the tempdb file can grow, it does take some resources to perform this task. By having the tempdb file at the correct size when SQL Server is restarted, you don’t have to worry about the overhead of it growing during production.
In addition, heavy activity in the tempdb database can drag down your application’s performance. This is especially true if you create one or more large temp tables and then are querying or joining them. To help speed these queries, be sure the AUTOSTATS database option is turned on for tempdb, and then create one or more indexes on these temp tables that can be used by your query. In many cases, you will find that this can substantially speed up your application. But like many performance tips, be sure you test this one to see if it actually helps in your particular situation.
Location of System Databases
The system databases (master, msdb, model) don’t experience a lot of read and write activity, so locating them on the same array as your SQL Server data files is generally not a performance issue. The only exception might be for very large databases with hundreds or thousands of users. In this case, putting them on their own array can help boost overall I/O performance somewhat.
Location of User Databases
For best performance, user database files (MDBs) should be located on their own array (RAID 1, 5, or 10), separate from all other data files, including log files. If you have multiple large databases on the same SQL Server, consider locating each separate database file(s) on its own array for less I/O contention.
Location of Log Files
Ideally, each log file should reside on its own separate array (RAID 1 or 10, RAID 5 will slow down transaction log writes more than you would like). The reason for this is because most of the time, transaction logs experience sequential writes, and if the array can write the data sequentially (not having to interrupt itself to perform other reads and writes), then sequential writes are very fast. But if the array can’t write sequentially because it has to random perform other reads and writes, sequential writes can’t be performed, and performance suffers.
Of course, having a separate array for each log file is expensive, and often can’t be cost justified. At the very least though, locate all log files on an array (RAID 1 or RAID 10) other than the array used for database files. While sequential write performance won’t be as good as if each log file had its own array, it is still much better than trying to contend for disk I/O with data files.
Number of Disk Controllers in Server
A single disk controller, whether is it is SCSI or fibre, has a maximum limit on its throughput. Because of this, you will want to match the number of disk controllers to the amount of data throughput you expect. As each controller is do different, I can’t recommend specific solutions, other than to say that at a very minimum, you will want two disk controllers. One controller should be used for non-hard disk devices, such as the CD-ROM, backup devices, and so on. And the other controller would be used for hard disk. The goal is not to attach both slow and fast devices on the same controller.
Quite often, you see this scenario, which is a good one. One controller is for non-hard disk devices, one controller is used for a RAID 1 local hard disk, and a third (and sometimes more) is used for arrays that hold SQL Server database files and logs. Be sure you don’t attach more drives to a controller than it can handle. While it may work, performance will suffer.
Type of Disk Controllers in Server
Always purchase the fastest disk controller you can afford, assuming you want the best SQL Server performance. As you may know, different disk controllers have different performance characteristics. For example, there are different types of SCSI, such as Wide SCSI, Narrow SCSI, Ultra SCSI, and so on. The same is true, although to a less degree, of fibre connections.
Because of the wide variety of controllers, I can’t recommend any specific ones. Generally, a hardware vendor will offer several models to choose from. Ask about the performance benefits of each one, and get the one that offers the best throughput.