One cause of excess I/O on a SQL Server is page splitting. A page split occurs when an index or data page becomes full and roughly half of its rows are moved to a newly allocated page to make room. While occasional page splitting is normal, frequent page splitting generates extra disk I/O and contributes to slow performance.
If you want to find out whether your SQL Server is experiencing a large number of page splits, monitor the SQL Server Access Methods object: Page Splits/sec. If you find that the number of page splits is high, consider lowering the fill factor of your indexes. A lower fill factor helps reduce page splits because it leaves more free space in each page when the index is built or rebuilt, so pages take longer to fill up and splits occur less often.
What is a high Page Splits/sec? There is no simple answer, as it somewhat depends on your system’s I/O subsystem. But if you are having disk I/O performance problems on a regular basis, and this counter is over 100 on a regular basis, then you might want to experiment with lowering the fill factor to see if it helps.
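The fill-factor arithmetic can be sketched with a little back-of-the-envelope code. This is a simplified model, not SQL Server's exact internal accounting: it assumes a fixed row size and ignores row overhead and slot arrays, but it shows why a lower fill factor buys headroom before a split.

```python
PAGE_SIZE_BYTES = 8192    # SQL Server page size
PAGE_HEADER_BYTES = 96    # fixed page header, leaving roughly 8096 bytes for rows

def rows_before_split(row_size, fill_factor):
    """Roughly how many more rows fit on a page built at the given
    fill factor before the page is full and must split."""
    usable = PAGE_SIZE_BYTES - PAGE_HEADER_BYTES
    rows_at_capacity = usable // row_size          # rows on a 100%-full page
    rows_at_fill = int(rows_at_capacity * fill_factor / 100)
    return rows_at_capacity - rows_at_fill

# A 200-byte row at fill factor 100 leaves no headroom at all,
# while fill factor 80 leaves room for several more inserts per page:
print(rows_before_split(200, 100))  # 0
print(rows_before_split(200, 80))   # 8
```

The trade-off, of course, is that a lower fill factor means more pages overall, so reads touch more pages; it is a knob to experiment with, not a free win.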
If you want to see how much physical RAM is devoted to SQL Server’s data cache, monitor the SQL Server Buffer Manager object: Cache Size (pages). This number is reported in 8 KB pages, so you will have to multiply it by 8,192 to determine the amount of RAM, in bytes, being used.
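The conversion is trivial but easy to fumble under pressure, so here is the arithmetic spelled out. The counter value is hypothetical; only the 8 KB page size is fixed.

```python
PAGE_SIZE_BYTES = 8192  # each buffer pool page is 8 KB

def pages_to_mb(pages):
    """Convert a page-based counter value to megabytes."""
    return pages * PAGE_SIZE_BYTES / (1024 * 1024)

# e.g. a Cache Size (pages) reading of 131,072 pages:
print(pages_to_mb(131072))  # 1024.0 -> 1 GB of data cache
```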
Generally, this number should be close to the total amount of RAM in the server, less the RAM used by Windows Server, SQL Server, and any utilities you have running on the server.
If the amount of RAM devoted to the data cache is much smaller than you would expect, then you need to do some investigating to find out why. Perhaps you aren’t allowing SQL Server to dynamically allocate RAM, and have instead accidentally configured SQL Server to use less RAM than it should have access to for optimal performance. Whatever the cause, you need to find a solution, as the amount of data cache available to SQL Server can significantly affect SQL Server’s performance.
In the real world, I don’t spend much time looking at this counter, as there are other counters that do a better job of letting you know if SQL Server is memory starved or not.
To get a feel for how busy SQL Server is, monitor the SQLServer: SQL Statistics: Batch Requests/sec counter. This counter measures the number of batch requests SQL Server receives per second, and it generally tracks how busy your server’s CPUs are. Generally speaking, over 1,000 batch requests per second indicates a very busy SQL Server; if you are not already experiencing a CPU bottleneck, you may very well be soon. Of course, this is a relative number: the bigger your hardware, the more batch requests per second SQL Server can handle.
From a network bottleneck standpoint, a typical 100 Mbps network card is only able to handle about 3,000 batch requests per second. If you have a server that is this busy, you may need to add two or more network cards, or move to a 1 Gbps network card.
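A rough bandwidth calculation shows where a figure like 3,000 batches per second comes from. The bytes-per-batch and link-efficiency numbers below are assumptions for illustration; measure your own workload's round-trip sizes before relying on any such ceiling.

```python
def max_batches_per_sec(link_mbps, avg_bytes_per_batch, efficiency=0.5):
    """Back-of-the-envelope ceiling on batches/sec a NIC can carry.
    avg_bytes_per_batch (request plus result) and the efficiency
    factor are assumed values, not measured ones."""
    usable_bytes_per_sec = link_mbps * 1_000_000 / 8 * efficiency
    return usable_bytes_per_sec / avg_bytes_per_batch

# With ~2 KB per round trip at 50% usable link capacity, a 100 Mbps
# card tops out near the ~3,000 batches/sec figure quoted above:
print(round(max_batches_per_sec(100, 2048)))  # 3052
```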
Some DBAs use the SQLServer: Databases: Transactions/sec: _Total counter to measure total SQL Server activity, but this is not a good idea. Transactions/sec only counts activity inside transactions, not all activity, producing skewed results. Instead, always use the SQLServer: SQL Statistics: Batch Requests/sec counter, which measures all SQL Server activity.
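When you read these counters yourself (for example through `sys.dm_os_performance_counters`), many of the "per second" counters are exposed as running totals, so you derive the rate from two samples. The sample values below are hypothetical; the delta-over-interval technique is the point.

```python
def per_second_rate(first_value, second_value, interval_seconds):
    """Derive a per-second rate from two samples of a cumulative
    counter such as Batch Requests/sec, which is exposed as a
    running total rather than an instantaneous rate."""
    return (second_value - first_value) / interval_seconds

# Two samples of the Batch Requests total taken 10 seconds apart:
print(per_second_rate(1_500_000, 1_512_000, 10))  # 1200.0 -> a very busy server
```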
Compilation of Transact-SQL code is a normal part of SQL Server’s operation. But because compilations consume CPU and other resources, SQL Server attempts to reuse as many cached execution plans as possible (an execution plan is created when a compilation occurs). The more execution plans are reused, the less overhead there is on the server, and the faster overall performance is.
To find out how many compilations SQL Server is doing, you can monitor the SQLServer: SQL Statistics: SQL Compilations/Sec counter. As you would expect, this measures how many compilations are performed by SQL Server per second.
Generally speaking, if this figure is over 100 compilations per second, then you may be experiencing unnecessary compilation overhead. A number this high might simply indicate that your server is very busy, or it could mean that unnecessary compilations are being performed. For example, SQL Server forces a recompilation if an object’s schema changes, if a previously parallelized execution plan has to run serially, if statistics are recomputed, or if a number of other things occur. In some cases, you have the power to reduce the number of unnecessary compilations.
If you find that your server is performing over 100 compilations per second, take the time to investigate whether the cause is something you can control. Too many compilations hurt your SQL Server’s performance.
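One practical way to put the compilation counter in context is to compare it against Batch Requests/sec, since compilations on a busy server may just reflect volume. A rough plan-reuse figure, under the simplifying assumption that each compilation corresponds to one batch that missed the plan cache:

```python
def plan_reuse_percent(batch_requests_per_sec, compilations_per_sec):
    """Rough plan-reuse figure: the share of batches that did NOT
    trigger a compilation. Both inputs are per-second rates; the
    one-compilation-per-missed-batch mapping is a simplification."""
    if batch_requests_per_sec == 0:
        return 100.0
    return 100.0 * (1 - compilations_per_sec / batch_requests_per_sec)

# 1,000 batches/sec with 100 compilations/sec -> 90% of plans reused
print(plan_reuse_percent(1000, 100))  # 90.0
```

The closer this figure is to 100%, the less compilation overhead the server is carrying; a low figure on a busy server is the signal worth investigating.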
The SQLServer: Databases: Log Flushes/sec counter measures the number of log flushes per second. This can be measured on a per database level, or for all databases on a SQL Server.
So exactly what is a log flush? The best way to describe it is with an example. Say you start a transaction that contains 10 INSERTs. When the first INSERT runs, two things happen essentially at the same time: the data page in the buffer cache is updated with the newly inserted row, and the corresponding log record is written to the log cache. This continues until the transaction is complete. At that point, the log records for this transaction are immediately written from the log cache to the log file, while the modified data pages stay in the buffer cache until the next checkpoint process runs and writes the newly inserted rows to the database files.
You may have never heard of the log cache. It is an area of memory where SQL Server stages log records before they are written to the log file. The log cache is very important, as its records are used to roll back a transaction before it is committed, if circumstances call for it. But once a transaction is complete (and can no longer be rolled back), the log cache is immediately flushed to the physical log file. This is normal behavior. Keep in mind that SELECT queries that don’t modify data don’t create transactions and don’t produce log flushes.
Essentially, a log flush occurs when data is written from the log cache to the physical log file. In essence, a log flush occurs every time a transaction completes, so the number of log flushes is related to the number of transactions SQL Server performs. And as you might expect, the size of a log flush (how much data is written from the log cache to disk) varies with the transaction. So how can this information help us?
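The sequence described above can be sketched as a toy model: log records accumulate in memory, and committing flushes them to "disk" in one go. This mirrors the description in the text, not SQL Server's actual internals (real log flushes are also bounded in size and can happen before commit).

```python
class LogBufferModel:
    """Toy model of the log cache and log flushes."""
    def __init__(self):
        self.log_cache = []   # in-memory log records, not yet durable
        self.log_file = []    # records made durable on 'disk'
        self.flushes = 0      # how many log flushes have occurred

    def write_record(self, record):
        self.log_cache.append(record)       # each INSERT is logged to the log cache

    def commit(self):
        self.log_file.extend(self.log_cache)  # log cache written to the log file
        self.log_cache.clear()
        self.flushes += 1                     # one log flush per commit

log = LogBufferModel()
for i in range(10):                  # a transaction containing 10 INSERTs
    log.write_record(f"insert row {i}")
log.commit()                         # transaction completes: one flush of 10 records
print(log.flushes, len(log.log_file))  # 1 10
```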
Let’s say that we know we have a disk I/O bottleneck, but we are not sure what is causing it. One way to troubleshoot the disk I/O bottleneck is to capture the Log Flushes/sec counter data and see how busy this mechanism is. As you might expect, if your server experiences lots of transactions, it will also experience a lot of log flushes, so the value you see for this counter will vary from server to server, depending on how busy it is with action-type queries that create transactions. What you want to do with this information is to try to identify situations where the number of log flushes per second seems significantly higher than the number of transactions you think should be running on the server.
For example, let’s say that you have a daily process that INSERTs 1,000,000 rows into a table. There are several different ways these rows could be inserted. First, each row could be inserted separately, with each INSERT wrapped in its own transaction. Second, all of the INSERTs could be performed within a single transaction. And last, the INSERTs might be divided into multiple transactions, somewhere between 1 and 1,000,000. Each of these options has a significantly different effect on SQL Server and on the number of log flushes per second. In addition, it’s easy to assume that a process you are running is a single transaction, even though it might not be; most people tend to think of a single process as a single transaction.
In the first case, if 1,000,000 rows are INSERTed with 1,000,000 transactions, there will also be 1,000,000 log flushes. But in the second case, 1,000,000 rows will be inserted within a single transaction, and there will only be one log flush. And in the third case, the number of log flushes will equal the number of transactions. Obviously, each individual log flush will be much larger with one transaction than with 1,000,000 tiny ones, but for the most part this is not important from a performance standpoint as described here.
So which option is best? In every case you will still be producing a lot of disk I/O; there is no way around that when you are dealing with 1,000,000 rows. But by using one, or just a few, transactions, you reduce the number of log flushes significantly, which reduces disk I/O, eases the I/O bottleneck, and boosts performance.
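The three insert strategies above can be compared with the simplified one-flush-per-committed-transaction model used in this discussion. The numbers are the scenario's, not measurements; real log flush counts will differ somewhat, since large transactions are flushed in multiple bounded pieces.

```python
import math

def approx_log_flushes(total_rows, rows_per_transaction):
    """Estimate log flushes for a batch insert under the simplified
    model in the text: one log flush per committed transaction."""
    return math.ceil(total_rows / rows_per_transaction)

print(approx_log_flushes(1_000_000, 1))          # 1000000: one flush per row
print(approx_log_flushes(1_000_000, 1_000_000))  # 1: one flush for the whole batch
print(approx_log_flushes(1_000_000, 10_000))     # 100: a middle-ground batch size
```

In practice a middle-ground batch size is often the sweet spot: far fewer flushes than row-at-a-time inserts, without holding one enormous transaction (and its locks and rollback exposure) open for the whole load.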
So we have learned two key things here. First, you want to reduce the number of log flushes as much as you can. Second, a key way to do this is to reduce the number of separate transactions occurring on your server by batching work into fewer, larger transactions.
Since the number of users using SQL Server affects its performance, you may want to keep an eye on the SQL Server General Statistics Object: User Connections. This shows the number of user connections, not the number of users, that are currently connected to SQL Server.
When interpreting this number, keep in mind that a single user can have multiple connections open, and also that multiple people can share a single user connection. Don’t make the assumption that this number represents actual users. Instead, use it as a relative measure of how “busy” the server is. Watch the number over time to get a feel if your server is being used more, or being used less.
If your databases are suffering from deadlocks, you can track them using the SQL Server Locks object: Number of Deadlocks/sec. But unless this number is relatively high, you won’t see much here, because the counter is measured per second and it takes quite a few deadlocks to register a noticeable value.
But it is still worth checking if you are having a deadlock problem. Better yet, use Profiler’s ability to track deadlocks; it will provide you with more detailed information. What you might consider doing is to use the Number of Deadlocks/sec counter on a regular basis to get the “big picture,” and if you discover deadlock problems with this counter, then use Profiler to “drill down” on the problem for a more detailed analysis.