Memory Performance Counters
This section assumes that your server is dedicated to SQL Server, and perhaps some related server utilities. If not, and you are having memory-related performance problems on your SQL Server, then your first step should be to move all non-SQL Server-related programs off of the physical server running SQL Server. Once you have done that, then use the following information to look for memory-related bottlenecks.
One of the key counters you should be regularly watching is the Memory Object: Pages/Sec. This measures the number of pages per second that are paged out of memory to disk, or paged into memory from disk. Assuming that SQL Server is the only major application running on your server, then this figure should average nearly zero, except for occasional spikes, which are normal.
Over continuous periods of time (10 minutes or so) the Pages/Sec should ideally be near zero (you will see some activity). If this is not the case, this means that NT Server is having to page data, which it should not be doing. This is because SQL Server does not use NT Server’s page file, and since this should be the only application on your server, there should be little paging going on.
If there is regular paging going on, this means that you are running other applications on your server, which is causing NT Server to page, or you have set the SQL Server Max Server Memory configuration setting to some other setting other than “Dynamically configure SQL Server memory”. Determine which is the problem and fix it, as this paging is slowing down SQL Server’s performance. Ideally, remove the NT applications causing the paging.
If you have changed the SQL Server Max Server Memory configuration to some other value other than “Dynamically configure SQL Server memory”, then change it back to this setting. SQL Server should be allowed to take as much RAM as it wants for its own use without having to compete for RAM with other applications.
Another way to double-check to see if your SQL Server has enough physical RAM is to check the Memory Object: Available Bytes counter. This counter can be viewed from Performance Monitor or from the NT Server or Windows 2000 Task Manager (see the Performance tab). This value should be greater than 5MB. If not, then your SQL Server needs more physical RAM. On a server dedicated to SQL Server, SQL Server attempts to maintain from 4-10MB of free physical memory. The remaining physical RAM is used by the operating system and SQL Server. When the amount of available bytes is less than 4MB, most likely SQL Server is also paging (which it shouldn’t) and is experiencing a performance hit.
If your SQL Server is experiencing memory bottlenecks, consider these possible solutions:
-
Add more physical RAM. If you are already at 2GB of physical RAM and are still having a memory problem, consider upgrading to SQL Server 7 Enterprise Edition (which supports up to 3GB), or upgrading to SQL 2000 Enterprise Edition, which in theory can support up to 64GB of physical RAM.
-
Ensure that SQL Server is the only application running on the server (other than server management utilities).
-
Remove or disable all unnecessary services.
-
Ensure SQL Server is running as a member server (not a domain controller).
-
Configure SQL Server to allocate memory dynamically (the default), don’t hard code how much RAM SQL Server is allocated.
Network Performance Counters
One of the best ways to monitor if you have a network bottleneck is to watch the Network Interface Object: Bytes Total/Sec counter. This counter measures the number of bytes that are being sent back and forth between your server and the network. This includes both SQL Server and non-SQL Server network traffic. Assuming your server is a dedicated SQL Server, then the vast majority of the traffic measured by this counter should be from SQL Server.
There is no hard and fast “correct” number for this counter as it measures the actual traffic. To help you decide if your server has a network bottleneck, one way to use this number is to compare it with the maximum traffic supported by the network connection your server is using. Also, this is another important counter to watch over time. It is important to know if your network traffic is increasing regularly. If it is, then you can use this information to help you plan for future hardware needs.
If your SQL Server is experiencing network bottlenecks, consider these possible solutions:
-
Add faster network cards.
-
Add additional network cards.
-
Server network card should be attached to switches.
-
Network cards should be running in full duplex mode.
-
Tune your application so that it does not require unnecessary network trips. Do this by returning only the required data and used stored procedures.
-
Remove all unnecessary network protocols from the server.
-
Use TCP/IP as the network library on the client and server.
Before you can use the network performance counters, the Network Monitor Agent service must be installed on your server. After installing it, you will have to reboot. Also, don’t forget to rerun the latest NT service pack to update the files added during the installation process.
SQL Server Performance Counters
Up to this point we have focused on identifying and resolving key hardware-related bottlenecks. In this section, we will take a look of some of the Performance Monitor counters you can use to identify specific performance issues in SQL Server.
One of the key SQL Server Performance Monitor counters to watch is the SQL Server Buffer Manager Object: Buffer Cache Hit Ratio. This indicates how often SQL Server goes to the buffer, not the hard disk, to get data. In OLTP applications, this ratio should exceed 95% on a regular basis. If it doesn’t, then you should consider adding more RAM to your server to increase performance. In OLAP applications, the hit ratio may be much less because of the nature of how OLAP works. In any case, more physical RAM should increase the performance of SQL Server running either OLTP or OLAP applications.
If you want to see how much physical RAM is devoted to SQL Server’s data cache, monitor the SQL Server Buffer Manager Object: Cache Size (pages). This number is presented in pages, so you will have to take this number and multiply it by 8K (8,192) to determine the amount of RAM in K that is being used. Generally, this number should almost come close to the total amount of RAM in your computer, assuming you are devoting your server to SQL Server. This number should be close to the total amount of RAM in the server, less the RAM used by NT, SQL Server, and any utilities you have running on the server. If the amount of RAM devoted to the data cache is much smaller than you would expect, then you need to do some investigating to find out why. Perhaps you aren’t allowing SQL Server to dynamically allocate RAM. Whatever the cause, you need to find a solution, as the amount of data cache available to SQL Server can significantly affect SQL Server’s performance.
Since the number of users using SQL Server affects its performance, you may want to keep an eye on the SQL Server General Statistics Object: User Connections. This shows the number of user connections, not the number of users, that currently are connected to SQL Server. When interpreting this number, keep in mind that a single user can have multiple connections open, and also that multiple people can share a single user connection. Don’t make the assumption that this number represents actual users. Instead, use it as a relative measure of how “used” the server is. Watch the number over time to get a feel if your server is being more used, or less used. If usage is going up, then you can use this information to help you better plan future hardware needs.
One cause of excess I/O on a SQL Server is page splitting. Page splitting occurs when an index or data page becomes full, and then is split between the current page and a newly allocated page. While occasional page splitting is normal, excess page splitting can cause performance issues. To find out if you are experiencing a large number of page splits, monitor the SQL Server Access Methods Object: Page Splits/sec. Unfortunately, I don’t have a good figure to recommend as to what this maximum number should be for this counter. The goal should be to keep it as low as possible. What you may want to do is watch it over a period of time. If it is increasing, this may indicate that you need to rebuild the indexes on the tables in your databases, and you may want to consider increasing the fillfactor on the indexes when you rebuild them.
If your users are complaining that they have to wait for their transactions to complete, you may want to find out if object locking on the server is contributing to this problem. To do this, use the SQL Server Locks Object: Average Wait Time (ms). You can use this counter to measure the average wait time of a variety of locks, including: database, extent, Key, Page, RID, and table. If you can identify one or more types of locks causing transaction delays, then you will want to investigate further to see if you can identify what specific transactions are causing the locking. The Profiler is the best tool for this detailed analysis.
While table scans are a fact of life, and sometimes faster than index seeks, generally it is better to have fewer table scans than more. To find out how many table scans your server is performing, use the SQL Server Access Methods Object: Full Scans/sec. Note that this counter is for an entire server, not just a single database. One thing you will notice with this counter is that there often appears to a pattern of scans occurring periodically. In many cases, these are table scans SQL Server is performing on a regular basis for internal use. What you want to look for are the random table scans that represent your application. If you see what you consider to be an inordinate number of table scans, then break out the Profiler and Index Tuning Wizard to help you determine exactly what is causing them, and if adding any indexes can help reduce the table scans. Of course, SQL may just be doing its job well, and performing table scans instead of using indexes because it is just plain more efficient.
If you suspect that your backup or restore operations are running at sub-optimal speeds, you can help verify this by using the SQL Server Backup Device Object: Device Throughput Bytes/sec. This counter will give you a good feel for how fast your backups are performing. You will also want to use the Physical Disk Object: Avg. Disk Queue Length counter to help collaborate your suspicions. Most likely, if your are having backup or restore performance issues, it is because of an I/O bottleneck.
Removing Bottlenecks is Only One Part of Performance Tuning
When many people think about performance tuning, they only think about how to remove hardware-related bottlenecks. What they don’t know is that most SQL Server-related performance problems are not hardware related, but are mostly caused by poor database and application design. Because of this, they are often disappointed that they can’t figure their performance issues by throwing hardware at the problem. Removing hardware bottlenecks is just one part of the much larger subject of SQL Server performance tuning and optimization.
]]>