SQL Server Transact-SQL WHERE
This tip may sound obvious to most of you, but I have seen professional developers, in two major SQL Server-based applications used worldwide, not follow it. And that is to always include a WHERE clause in your SELECT statement to narrow the number of rows returned. If you don’t use a WHERE clause, then SQL Server will perform a scan of your table/index and return all of the rows. In some cases you may want to return all rows, and not using a WHERE clause is appropriate in this cases. But if you don’t need all the rows returned, use a WHERE clause to limit the number of rows returned.
By returning data you don’t need, you are causing SQL Server to perform I/O it doesn’t need to perform, wasting SQL Server resources. In addition, it increases network traffic, which can also lead to reduced performance. And if the table is very large, a scan can lock the table, preventing other users from accessing it, hurting concurrency.
Another negative aspect of a scan is that it will tend to flush out data pages from the cache with often useless data, which reduces SQL Server’s ability to reuse useful data in the cache, which increases disk I/O and hurts performance. [2000, 2005, 2008] Updated 2-4-2009
In a WHERE clause, the various operators used can affect how fast a query is run. This is because some operators lend themselves to speed over other operators. Of course, you may not have any choice of which operator you use in your WHERE clauses, but sometimes you do.
Here are the key operators used in the WHERE clause, ordered by their performance. Those operators at the top will produce results faster than those listed at the bottom.
>, >=, <, <=
This lesson here is to use = as much as possible, and <> as least as possible. [2000, 2005, 2008] Updated 2-4-2009
In a WHERE clause, the various operands used directly affect how fast a query is run. This is because some operands lend themselves to speed over other operands. Of course, you may not have any choice of which operand you use in your WHERE clauses, but sometimes you do.
Here are the key operands used in the WHERE clause, ordered by their performance. Those operands at the top will produce results faster than those listed at the bottom.
- A single literal used by itself on one side of an operator.
- A single column name used by itself on one side of an operator, a single parameter used by itself on one side of an operator.
- A multi-operand expression on one side of an operator.
- A single exact number on one side of an operator.
- Other numeric number (other than exact), date, and time.
- Character data, NULLs.
The simpler the operand, and using exact numbers, provides the best overall performance. [2000, 2005, 2008] Updated 2-4-2009
If a WHERE clause includes multiple expressions, there is generally no performance benefit gained by ordering the various expressions in a particular order. This is because the SQL Server Query Optimizer automatically orders them (behind the scenes) for optimal performance. There are a few exceptions to this, which are discussed on this web site. [2000, 2005, 2008] Updated 2-4-2009
By default, some developers, especially those who have not worked with SQL Server before, routinely include code similar to this in their WHERE clauses when they make string comparisons:
SELECT column_name FROM table_name
WHERE LOWER(column_name) = ‘name’
In other words, these developers are making the assuming that the data in SQL Server is case-sensitive, which it generally is not. If your SQL Server database is not configured to be case sensitive, you don’t need to use LOWER or UPPER to force the case of text to be equal for a comparison to be performed. Just leave these functions out of your code. This will speed up the performance of your query, as any use of text functions in a WHERE clause hurts performance.
But what if your database has been configured to be case-sensitive? Should you then use the LOWER and UPPER functions to ensure that comparisons are properly compared? No. The above example is still poor coding. If you have to deal with ensuring case is consistent for proper comparisons, use the technique described below, along with appropriate indexes on the column in question:
SELECT column_name FROM table_name
WHERE column_name = ‘NAME’ or column_name = ‘name’
This code will run much faster than the first example. [2000, 2005, 2008] Updated 2-4-2009
Try to avoid WHERE clauses that are non-sargable. The term “sargable” (which is in effect a made-up word) comes from the pseudo-acronym “SARG”, which stands for “Search ARGument,” which refers to a WHERE clause that compares a column to a constant value. If a WHERE clause is sargable, this means that it can take advantage of a useful index (assuming one is available) to speed completion of the query. If a WHERE clause is non-sargable, this means that the WHERE clause (or at least part of it) cannot take advantage of an index, instead performing a table/index scan, which may cause the query’s performance to suffer.
Non-sargable search arguments in the WHERE clause, such as “IS NULL”, “<>”, “!=”, “!>”, “!<”, “NOT”, “NOT EXISTS”, “NOT IN”, “NOT LIKE”, and “LIKE ‘%500′” generally prevents (but not always) the query optimizer from using a useful index to perform a search. In addition, expressions that include a function on a column, expressions that have the same column on both sides of the operator, or comparisons against a column (not a constant), are not sargable.
But not every WHERE clause that has a non-sargable expression in it is doomed to a table/index scan. If the WHERE clause includes both sargable and non-sargable clauses, then at least the sargable clauses can use a useful index (if one exists) to help access the data quickly.
In many cases, if there is a covering index on the table, which includes all of the columns in the SELECT, JOIN, and WHERE clauses in a query, then the covering index can be used instead of a table/index scan to return a query’s data, even if it has a non-sargable WHERE clause. But keep in mind that covering indexes have their own drawbacks, such as potentially producing very wide indexes that increase disk I/O when they are read.
In some cases, it may be possible to rewrite a non-sargable WHERE clause into one that is sargable. For example, the clause:
WHERE SUBSTRING(firstname,1,1) = ‘m’
Can be rewritten like this:
WHERE firstname like ‘m%’
Both of these WHERE clauses produce the same result, but the first one is non-sargable (it uses a function) and will run slower, while the second one is sargable, and will run much faster.
WHERE clauses that perform some function on a column are non-sargable. On the other hand, if you can rewrite the WHERE clause so that the column and function are separate, then the query can use an available useful index, greatly boosting performance. For example:
Function Acts Directly on Column, and Index Cannot Be Used:
SELECT member_number, first_name, last_name
WHERE DATEDIFF(yy,datofbirth,GETDATE()) > 21
Function Has Been Separated From Column, and an Index Can Be Used:
SELECT member_number, first_name, last_name
WHERE dateofbirth < DATEADD(yy,-21,GETDATE())
Each of the above queries produces the same results, but the second query will use an available, useful index because the function is not performed directly on the column, as it is in the first example. The moral of this story is to try to rewrite WHERE clauses that have functions so that the function does not act directly on the column.
WHERE clauses that use NOT are not sargable, but can often be rewritten to remove the NOT from the WHERE clause, for example:
WHERE NOT column_name > 5
WHERE column_name <= 5
Each of the above clauses produces the same results, but the second one is sargable.
If you don’t know if a particular WHERE clause is sargable or non-sargable, check out the query’s execution plan in Query Analyzer or Management Studio. Doing this, you can very quickly see if the query uses available indexes as you expect.
With some careful analysis, and some clever thought, many non-sargable queries can be written so that they are sargable. Your goal for best performance (assuming it is possible) is to get the left side of a search condition to be a single column name, and the right side an easy to look up value. [2000, 2005, 2008] Updated 2-4-2009