SQL Server Transact-SQL SELECT Statement

When using the UNION statement, keep in mind that, by default, it performs the equivalent of a SELECT DISTINCT on the final recordset. In other words, UNION takes the results of two like recordsets, combines them, and then performs a SELECT DISTINCT in order to eliminate any duplicate rows. This process occurs even if there are no duplicate records in the final recordset. If you know that there are duplicate records, and this presents a problem for your application, then by all means use the UNION statement to eliminate the duplicate rows.

On the other hand, if you know that there will never be any duplicate rows, or if there are, and this presents no problem to your application, then you should use the UNION ALL statement instead of the UNION statement. The advantage of UNION ALL is that is does not perform the SELECT DISTINCT function, which saves a lot of unnecessary SQL Server resources from being used. [2000, 2005, 2008] Updated 2-3-2009

*****

Sometimes you might want to merge two or more sets of data resulting from two or more queries using UNION. For example:

SELECT column_name1, column_name2

FROM table_name1

WHERE column_name1 = some_value

UNION

SELECT column_name1, column_name2

FROM table_name1

WHERE column_name2 = some_value

This same query can be rewritten, like the following example, and when doing so, performance will be boosted:

SELECT DISTINCT column_name1, column_name2

FROM table_name1

WHERE column_name1 = some_value OR column_name2 = some_value

And if you can assume that neither criteria will return duplicate rows, you can even further boost the performance of this query by removing the DISTINCT clause. [2000, 2005, 2008] Updated 2-3-2009

*****

Carefully evaluate whether your SELECT query needs the DISTINCT clause or not. Some developers automatically add this clause to every one of their SELECT statements, even when it is not necessary. This is a bad habit that should be stopped.

The DISTINCT clause should only be used in SELECT statements if you know that duplicate returned rows are a possibility, and that having duplicate rows in the result set would cause problems with your application.

The DISTINCT clause creates a lot of extra work for SQL Server, and reduces the physical resources that other SQL statements have at their disposal. Because of this, only use the DISTINCT clause if it is necessary. [2000, 2005, 2008] Updated 2-3-2009

*****

You may have heard of a SET command called SET ROWCOUNT. Like the TOP operator, it is designed to limit how many rows are returned from a SELECT statement. In effect, the SET ROWCOUNT and the TOP operator perform the same function.

While in most cases, using either option works equally efficiently, there are some instances (such as rows returned from an unsorted heap) where the TOP operator is more efficient than using SET ROWCOUNT. Because of this, using the TOP operator is preferable to using SET ROWCOUNT to limit the number of rows returned by a query. [2000, 2005, 2008] Updated 2-3-2009

Continues…

Leave a comment

Your email address will not be published.