Eliminate the Use of Temporary Tables For HUGE Performance Gains

As queries become more complex, temporary tables are used more and more. While temporary table may sometimes be unavoidable, they can often be sidestepped by using derived tables instead. In brief, a derived table is the result of using another SELECT statement in the FROM clause of a SELECT statement. By using derived tables instead of temporary tables, we can boost our application’s performance. Let’s find out more.

How the Use of Temporary Tables Affect Performance

Temporary tables slow performance dramatically. The problem with temporary tables is the amount of overhead that goes along with using them. In order to get the fastest queries possible, our goal must be to make them do as little work as possible. For example, with a SELECT statement, SQL Server reads data from the disk and returns the data. However, temporary tables require the system to do much more.

For example, a piece of Transact-SQL code using temporary tables usually will:

1) CREATE the temporary table
2) INSERT data into the newly created table
3) SELECT data from the temporary table (usually by JOINing to other physical tables) while holding a lock on the entire tempdb database until the transaction has completed. 
4) DROP the temporary table

This represents a lot of disk activity, along with the potential for contention problems. And all of this adds up to poor performance.

Eliminate A Few Steps!

The biggest benefit of using derived tables over using temporary tables is that they require fewer steps, and everything happens in memory instead of a combination of memory and disk. The fewer the steps involved, along with less I/O, the faster the performance.

Here are the steps when you use a temporary table:

1) Lock tempdb database
2) CREATE the temporary table (write activity)
3) SELECT data & INSERT data (read & write activity)
4) SELECT data from temporary table and permanent table(s) (read activity)
5) DROP TABLE (write activity)
4) Release the locks

Compare the above to the number of steps it takes for a derived table:

1) CREATE locks, unless isolation level of “read uncommitted” is used
2) SELECT data (read activity)
3) Release the locks

As is rather obvious from this example, using derived tables instead of temporary tables reduces disk I/O and can boost performance. Now let’s see how.

Using Derived Tables

Derived tables are essentially SELECT statements within SELECT statements. Let’s look at a very simple example:

Take a look at this simple query where we SELECT data from a table:

USE northwind

GO

SELECT * FROM categories

Now, instead of selecting data from the categories table, let’s select our data from a derived table. For example:

USE northwind

GO

SELECT * FROM (SELECT * FROM categories) dt_categories

This is all there is to derived tables. Remember, a derived table is just the result of using another SELECT statement in the FROM clause of another SELECT statement. Simply put the query in parenthesis and add a table name after the query in the parenthesis.

Both of the above examples produce the exact same results. This example is designed to show you what a derived table is, and how easy they are to create. In the real world, if you needed to write a query like the one above, you would of course use the simpler of the two examples. But if your query is complex, consider using a derived table instead of a temporary table, as we will see in the following real world example.

Continues…

Pages: 1 2




Related Articles :

  • No Related Articles Found

6 Responses to “Eliminate the Use of Temporary Tables For HUGE Performance Gains”

  1. This advice is ONLY applicable to OLTP workloads (e.g., small quick transactions). SQL Server “derived tables” (subqueries to everyone else) are **highly** optimised for small SELECTs.

    In a DW / ETL / OLAP type workloads the exact **opposite** advice applies. In DW processing I often see HUGE performance gains from converting selects to temp tables.

  2. This is completely misleading. Everything exists for a reason or MS wouldn’t have invented temp tables. Derived tables are good for small quantity of data and select operation. When the data volumn is huge and often combined with operations like Group by, temp table can be lot faster by creating and using indexes.

  3. I agree with Joe as we have experienced the exact opposite effect with large result sets and fairly complex aggregation queries (derived tables) joining against each other. We eliminated the use of derived tables by placing the data into temp tables and took a query that was running for close to 11 hours down to running under 11 seconds. The execution plan indicated the optimizer had no idea what to expect from the derived tables as the actual number of rows was very different from the estimated number of rows.

  4. I have the same experience with Derived tables (Subqueries).

    An average select query performs nicely when retrieving all 82000 rows. Execution time is 1 second.

    Then I was asked to sum 1 column and to group it on the 2 remaining integer columns.

    I tried a derived table, Sum the main query and a temp table.

    The Temp Table keeps it in 1 second. The other 2 solutions are within 4/5 minutes.

    Thus my question; has any1 have any experience of forcing the query execution plan of the derived table?

  5. I think the exact opposite for general sql server queries. I have tried a query returning around 7000 rows both using temporary table(even without index) and using derived table.

    The difference is significant. Using temporary table takes literally 0ms while the derived table version takes at least 17ms.

    One of the bottleneck of the temporary table is of course the locking over entire tempdb. But it never implies that we should avoid that. I suggestion is to use temporary table whenever possible only keeping the locking issue under consideration.

  6. I’ll ditto what others have said. I use both derived and temp tables. I run a bunch of procedures start-of-day to double-check some info. In cases where it’s a small list of stuff to compile for reference, a derived table works like a charm. But, I found noticeable improvements when I decided to pre-compile a lot of aggregate data early as temp tables which are used in a lot of later queries. I don’t just make #temp tables, though, I make actual tables in the tempdb. I use them through-out the day while working on other queries, or ad-hoc’ing something on request. The checks I run used to take 5 minutes total every day. After I started temp tableing larger things, it’s down to 1:30, and that includes the time to temp-table the stuff. As for wear-n-tear on disk I/O and things … when I do the math for how much time I’m saving, times by how much I get paid by minute … the company can afford a little wear-n-tear on a hard drive compared to me twiddling my thumbs for an extra 3:30. (Besides, that 3:30 is enough time for me to get distracted in something else, which kills productivity totally.)

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |