Site sponsored by: Idera The gold standard of SQL Server performance monitoring & diagnostics.
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Tip Topics

All Tips
ADO.NET / ASP.NET
Administration
Analysis/OLAP Services
Application Development
Configuration
Components
ETL
Hardware
High Availability
Hints
Index
Misc
Operating Systems
Performance Tuning
Replication
T-SQL
Views

Write for Us

Share your SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

Recover Data Using Database Snapshots
Analyze and Fix Index Fragmentation in SQL Server 2008
Powerful Geographical Visualisations made easy with SQL 2008 Spatial (Part 2) ...
Backup User Databases Using a Maintenance Plan

More     
 
Latest FAQ's

How to alter a User Defined Data Type?
How to unzip a File in SSIS?
How to view previous query plans?
ALTER TABLE SWITCH statement failed because the object '%.*ls' is not ...

More     
   
Latest Software Reviews

Spotlight on ApexSQL Doc 2008
ApexSQL Enforce
Embarcadero Change Manager
SQL Server DBA Dashboard

More     

tips >> t-sql >> Performance Tuning SQL Server Joins

Performance Tuning SQL Server Joins

By : Brad McGehee
Apr 13, 2006

One of the best ways to boost JOIN performance is to limit how many rows need to be JOINed. This is especially beneficial for the outer table in a JOIN. Only return absolutely only those rows needed to be JOINed, and no more. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

If you perform regular joins between two or more tables in your queries, performance will be optimized if each of the joined columns have their own indexes. This includes adding indexes to the columns in each table used to join the tables. Generally speaking, a clustered key is better than a non-clustered key for optimum JOIN performance. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

If you have two or more tables that are frequently joined together, then the columns used for the joins on all tables should have an appropriate index. If the columns used for the joins are not naturally compact, then considering adding surrogate keys to the tables that are compact in order to reduce the size of the keys, thus decreasing read I/O during the join process, increasing overall performance. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

JOIN performance has a lot to do with how many rows you can stuff in a data page. For example, let's say you want to JOIN two tables. Most likely, one of these two tables will be smaller than the other, and SQL Server will most likely select the smaller of the two tables to be the inner table of the JOIN. When this happens, SQL Server tries to put the relevant contents of this table into the buffer cache for faster performance. If there is not enough room to put all the relevant data into cache, then SQL Server will have to use additional resources in order to get data into and out of the cache as the JOIN is performed.

If all of the data can be cached, the performance of the JOIN will be faster than if it is not. This comes back to the original statement, that the number of rows in a table can affect JOIN performance. In other words, if a table has no wasted space, it is much more likely to get all of the relevant inner table data into cache, boosting speed. The moral to this story is to try to get as much data stuffed into a data page as possible. This can be done through the use of a high fillfactor, rebuilding indexes often to get rid of empty space, and to optimize datatypes and widths when creating columns in tables. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

Keep in mind that when you create foreign keys, an index is not automatically created at the same time. If you ever plan to join a table to the table with the foreign key, using the foreign key as the linking column, then you should consider adding an index to the foreign key column. An index on a foreign key column can substantially boost the performance of many joins. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

Avoid joining tables based on columns with few unique values. If columns used for joining aren’t mostly unique, then the SQL Server optimizer may not be able to use an existing index in order to speed up the join. Ideally, for best performance, joins should be done on columns that have unique indexes. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

For best join performance, the indexes on the columns being joined should ideally be numeric data types, not CHAR or VARCHAR, or other non-numeric data types. The overhead is lower and join performance is faster. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

For maximum performance when joining two or more tables, the indexes on the columns to be joined should have the same data type, and ideally, the same width.

This also means that you shouldn't mix non-Unicode and Unicode datatypes. (e.g. VARCHAR and NVARCHAR). If SQL Server has to implicitly convert the data types to perform the join, this not only slows the joining process, but it also could mean that SQL Server may not use available indexes, performing a table scan instead. [6.5, 7.0, 2000, 2005] Updated 7-25-2005

*****

When you create joins using Transact-SQL, you can choose between two different types of syntax: either ANSI or Microsoft. ANSI refers to the ANSI standard for writing joins, and Microsoft refers to the old Microsoft style of writing joins. For example:

ANSI JOIN Syntax

SELECT fname, lname, department
FROM names INNER JOIN departments ON names.employeeid = departments.employeeid

Former Microsoft JOIN Syntax

SELECT fname, lname, department
FROM names, departments
WHERE names.employeeid = departments.employeeid

If written correctly, either format will produce identical results. But that is a big if. The older Microsoft join syntax lends itself to mistakes because the syntax is a little less obvious. On the other hand, the ANSI syntax is very explicit and there is little chance you can make a mistake.

For example, I ran across a slow-performing query from an ERP program. After reviewing the code, which used the Microsoft JOIN syntax, I noticed that instead of creating a LEFT JOIN, the developer had accidentally created a CROSS JOIN instead. In this particular example, less than 10,000 rows should have resulted from the LEFT JOIN, but because a CROSS JOIN was used, over 11 million rows were returned instead. Then the developer used a SELECT DISTINCT to get rid of all the unnecessary rows created by the CROSS JOIN. As you can guess, this made for a very lengthy query. I notified the vendor's support department about it, and they fixed their code.

The moral of this story is that you probably should be using the ANSI syntax, not the old Microsoft syntax. Besides reducing the odds of making silly mistakes, this code is more portable between database, and eventually, I imagine Microsoft will eventually stop supporting the old format, making the ANSI syntax the only option. [6.5, 7.0, 2000, 2005] Updated 11-1-2005


    Next Page>>    








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 1999-2008 by T10 Media. All rights reserved