Our SQL Server is experiencing lots of blocking, but we don’t have any resource bottlenecks.
I am the DBA for a SQL Server that is running a two-node active/passive cluster with four CPU’s and 3GB or RAM on each.
The server keeps up with demand most of the time. But once in a while we experience blocking issues and new connections are timing out–not a good thing. Our system monitor shows that we are not using up more than 1GB of RAM and we are hardly swapping. Our CPU’s are hitting 90 percent, but only occasionally, the average is about 30-40 percent. The buffer hit ratio is high in the 90′s.
All this dumfounds me. Any idea on how to diagnose the cause of the blocking?
Blocking and time out issues are generally (but not always) related to poor application design. They cannot be solved by having a powerful server, as you have already found out.
You may or may not be able to resolve this issue, depending on who is controlling the code. If it the code has been written in-house, then you have a chance at resolving it. If the code has been written be a third party, you may be out of luck, unless the vendor is interested in listening to your feedback (good luck at this).
One tool you can use to help you determine the cause of the blocking is to run a trace using Profiler. It allows you to observe the communications between your application and SQL Server, helping you to diagnose the cause of the blocking. This is not a particularly easy task, but it can be done.
Blocking occurs when a single connection from an application holds onto a lock, preventing a second connection for acquiring a conflicting lock type. When this happens, the second connection has to wait until the first lock is removed. If the wait time it long enough, the waiting connection can time out. It is not unusual for a connection to get a lock then hold it for an unusually long time, causing not just one other connection to be blocked, but many other connections, causing a cascading effect, which can be disastrous to your application’s performance.
Some common causes of blocking include, but are not limited to:
- Queries with long execution times (very common)
- Lack of appropriate indexes, causing queries to run long
- Canceling queries that were not committed or rolled back as the should be
- Applications that don’t process all the results to final completion
- Distributed client/server deadlocks
To help resolve blocking issues by reducing locks in your applications, see these two tips pages on this website:
Until the blocking problem can be fixed in the application, your best work-around (although not ideal) to this problem is to kill the process that is causing the block (assuming that it does not go away on its own in a reasonable amount of time). Blocking processes can be viewed and killed through the Enterprise Manager or Management Studio. Before you kill the blocking process, make a note of the activity going on so that you can use this information to help you better pinpoint the root cause of the problem.