How to Correlate a SQL Server 2005 Profiler Trace With System Monitor Performance Counter Data

Ending the Traces

At this point, both System Monitor and Profiler are collecting data. Collect the data for as long as it is appropriate to meet your needs. When you have collected all the data you want, end both System Monitor and Profiler, in any order, it doesn’t matter.

At this point you now have two log files: a System Monitor Counter Log and a Profiler Trace Log. I would also suggest that you exit both System Monitor and Profiler at this time. Doing so now can save some frustrations a little later.

How to Correlate System Monitor and Profiler Data

Now for the fun part. This is where you get to correlate the data in both of the logs you just collected, seeing how they relate to each other.

To perform the data correlation, start Profiler and load the Trace Log file that you just created. Next, click on “File” and then select “Import Performance Data.” The following dialog box appears. Here, you must select the System Monitor Counter Log file you created earlier.

Once you have selected the correct log, select “Open,” and the following dialog box appears.

In the above dialog box, you will see all the counters available to you. The only counters you will see will be the ones in the log you just opened. Select which counters you want to see correlated with Profiler event data then click “OK.” The following window appears.

In the above window, you now see a correlation between Profiler Events (at the top of the dialog box) and System Monitor Counter data in the graph (in the middle). Depending on how many events and counters you are viewing, the dialog box above can become very busy. The larger your screen, and the fewer the events and counters displayed, the easier the data will be to view and analyze.

You can view the data two different ways. First, you can click anywhere on the graph and a red line will appear. This indicates the counter activity at a specific point in time. And then at the top, you will see an event highlighted that occurred closest to the time the counter data was collected. This option allows you to see what event (or events) was occurring at a specific point in time. You can then see what query or other activity was going on during this time and how it was affecting server resources.

Second, you can click on any of the Profiler events at the top of the dialog box, and then the red line in the graph will move to the time that particular event took place. This way, you can see how a particular event affected server resources.

For example, in the figure below, notice that I have clicked on an event at the top of the window that had a duration of 3364934 milliseconds. Notice where the red line is. Here, you can see that this particular event, which was very long, was the likely cause that the % Processor Time ran about 60% or so for a very long period.

Fully Understanding What You Are Seeing

After you begin playing around with this correlation window, you may discover some oddities. For example, above, how do I know that this long running query was the cause for the CPU running for 60% for such a long time? Actually, I don’t. It is an educated guess. This tool does not have the ability to granularly determine exactly how much server resources were used up by a particular query. But you can make some educated guesses. In my case, I sorted all of the queries captured by Profiler by duration. By doing this, I found out that one of the queries ran for 3364934 milliseconds. Most of the others did not run that long. In the graph, I can see that the % CPU Utilization was high for about this same period of time. Thus, I can conclude that this particular query was the main cause of this excess CPU utilization. Especially when I can see that the CPU utilization went down to almost zero after the query finished executing.

You will also want keep in mind that multiple queries will probably be running at the same time, and that there may not be an exact time correlation between counters and events, as counters are measured over an interval of time you specify, and while events occur at discreet points in time. If you keep all of this in mind, and do a little practicing with the data, I think you will come to find that this correlation capability is very useful indeed. Now, finally, I can fairly easily determine what query (or queries) is causing my server to get stressed out.

]]>

Leave a comment

Your email address will not be published.