SQL Server Performance: Query Tuning vs. Process Tuning

In the various projects where I perform a performance audit or am involved in performance tuning, I often find that while queries and stored procedures function correctly, they are not efficient in processing larger data sets. Many databases that face OLAP-style workloads are set up like OLTP databases, handling single records one at a time instead of processing a record set as a whole.

By tuning a query with proper WHERE clauses, indexes, etc., you can often achieve a performance gain. However, when you tune the complete process and handle record sets as a whole, the performance gain can be many times greater.

Optimize for Changes or Selections

Tuning is the process of achieving optimal speed for mutations while using the fewest resources possible, and still keeping optimal performance for selections of data. Adding an index to a table will speed up selections, but slow down any mutations (inserts, updates, or deletes) in the table. The optimal balance between these two depends not only on the queries being executed on the database, but also on their frequency and priority.
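
As a minimal sketch of this trade-off (the table and index names here are invented for illustration, not taken from a real system), the index below speeds up selections that filter on CustomerID, but every mutation then has to maintain the index as well:

-- Hypothetical example table.
CREATE TABLE tbl_Orders
(
    OrderID    INT IDENTITY(1, 1) PRIMARY KEY,
    CustomerID INT NOT NULL,
    OrderDate  DATETIME NOT NULL
)

-- Speeds up selections that filter on CustomerID...
CREATE NONCLUSTERED INDEX IX_tbl_Orders_CustomerID
    ON tbl_Orders (CustomerID)

-- ...but every insert, update, or delete must now maintain
-- the index in addition to the table itself.
INSERT INTO tbl_Orders (CustomerID, OrderDate)
VALUES (42, GETDATE())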

If, for instance, all mutations are handled during off-hours, tuning is often best aimed at letting the mutations complete within those hours, while maximum performance is reserved for the selections during office hours, when people are actually waiting for the results of a selection.

Use of Resources

Performance is determined by the limitations of the available resources. The specific hardware available, the type of resource needed to perform the requested query, and the concurrent use of the server and its resources determine the time needed. Often, one of the resources accounts for the major part of the query cost. When performing on-the-fly calculations, the processor is the key issue; when the amount of data increases, memory and disk I/O have a large influence. When tuning, the biggest gain can be achieved by addressing these resources first. The query execution plan gives insight into the use of these resources.
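
Besides the graphical execution plan, SQL Server can report the I/O and CPU cost of each statement as messages, which is a quick way to see which resource dominates a query (the query below just reuses the hypothetical tbl_Orders table from above):

-- Report logical/physical reads and CPU time per statement.
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT CustomerID, OrderDate
FROM tbl_Orders
WHERE CustomerID = 42

SET STATISTICS IO OFF
SET STATISTICS TIME OFF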

Query Tuning

In query tuning, the main focus is to execute the existing queries with the best performance possible. By creating indexes, retrieving only the necessary columns and rows with correct WHERE clauses, using indexed views, using pre-calculated values, spreading tables over multiple disks, etc., a given query's speed can be increased tremendously. However, there is a limit to what can be achieved this way. Beyond that, extra resources such as more memory or faster disks can be added.
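
As a small example of such query tuning (again using the hypothetical tbl_Orders table), retrieving only the needed columns and filtering on an indexed column lets the optimizer use the index instead of scanning the whole table:

-- Untuned: reads every column of every row.
SELECT *
FROM tbl_Orders

-- Tuned: only the necessary rows and columns, supported by
-- the index on CustomerID.
SELECT OrderID, OrderDate
FROM tbl_Orders
WHERE CustomerID = 42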

Process Tuning

In process tuning, the target is to alter the complete process of handling the dataflow so that it uses as few connections as possible and limits the number and complexity of query executions. This may require the data model to be altered. Because the process is always very specific to the given situation and often influences many aspects of the database, there are no general guidelines to lead you through this tuning. However, by identifying the area where the largest amount of time or resources is spent in handling the data, a critical look at the current situation can lead to new methods. Most of the time, there is more than one method of handling the dataflow.
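
A classic instance of this (again a sketch with invented names) is replacing record-by-record handling with a single set-based statement. Both fragments below perform the same update, but the first executes one UPDATE per row, while the second handles the record set as a whole:

-- Record-by-record: one UPDATE execution per row.
DECLARE @OrderID INT
DECLARE cur CURSOR FOR
    SELECT OrderID FROM tbl_Orders WHERE CustomerID = 42
OPEN cur
FETCH NEXT FROM cur INTO @OrderID
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE tbl_Orders SET OrderDate = GETDATE() WHERE OrderID = @OrderID
    FETCH NEXT FROM cur INTO @OrderID
END
CLOSE cur
DEALLOCATE cur

-- Set-based: one execution for the whole record set.
UPDATE tbl_Orders
SET OrderDate = GETDATE()
WHERE CustomerID = 42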

Below are some illustrative examples.

Connections and Executions

Making a connection to a database takes time. Executing even a simple query takes time to compile and execute. This overhead is partly dependent on the table and its contents, but it always takes some time. For instance, the following code creates a table with one field. Without inserting any data, I query the table repeatedly and note the time:

CREATE TABLE tbl_Test (TestID CHAR(1))
DECLARE @Time1 DATETIME, @Time2 DATETIME, @i INT

SELECT @Time1 = GETDATE(), @i = 0

AGAIN:
-- First selection: measure against the previous timestamp in @Time1.
SELECT * FROM tbl_Test
SELECT @Time2 = GETDATE(), @i = @i + 1
PRINT 'TIME IS ' + CONVERT(CHAR, @Time2, 14) + ', i = ' + CONVERT(CHAR, @i) + ', TIMEDIFF = ' + CONVERT(CHAR, DATEDIFF(ms, @Time1, @Time2))

-- Second selection: measure against the timestamp in @Time2.
SELECT * FROM tbl_Test
SELECT @Time1 = GETDATE(), @i = @i + 1
PRINT 'TIME IS ' + CONVERT(CHAR, @Time1, 14) + ', i = ' + CONVERT(CHAR, @i) + ', TIMEDIFF = ' + CONVERT(CHAR, DATEDIFF(ms, @Time2, @Time1))

IF @i < 1000 GOTO AGAIN

This produces 1,000 empty selections. The messages show the time difference between the previous selection and the current one. The first 160 or so selections execute within the same millisecond. However, even with a query this trivial, after about 160 selections the accumulated overhead becomes measurable:

TIME IS 14:54:29:430, i = 158, TIMEDIFF = 0
(0 row(s) affected)
TIME IS 14:54:29:430, i = 159, TIMEDIFF = 0
(0 row(s) affected)
TIME IS 14:54:29:430, i = 160, TIMEDIFF = 16
(0 row(s) affected)
TIME IS 14:54:29:447, i = 161, TIMEDIFF = 0
(0 row(s) affected)
TIME IS 14:54:29:447, i = 162, TIMEDIFF = 0

As the table definitions grow more complex and the number of records in them increases, this overhead appears sooner and costs more time.

Continues…
