Eliminate the Use of Temporary Tables For HUGE Performance Gains

An Example: Rewriting A Stored Procedure Using Temp Tables 

For this example, we will use the Northwind database. The problem we want to solve is that we need a listing of categories and products in the Northwind database, with a column stating how many products are in each category. This is based on a real case where I needed the count of products within each category. For the curious, the reason was so that I could create JavaScript dynamically to populate a second listbox on the fly.

The desired output looks something like this:

Category Name> Product Name> Category Count>
Beverages Outback Lager 2
Beverages Chang 2
Condiments Aniseed Syrup 3
Condiments Cajun Seasoning 3
Condiments Gumbo Mix 3


This result says that for the category beverages, there are two products.

The first query that follows, which is designed to produce the required results, is slow and uses a temporary table.


SELECT GETDATE()

GO

– CREATE OUR TEMPORARY TABLE

CREATE TABLE #Temp_Example ( 
     [CategoryID] INT NOT NULL, 
     [Category_Count] INT NOT NULL
)

– INSERT THE VALUES WE LATER NEED INTO THE TEMP TABLE

INSERT INTO #Temp_Example (CategoryID, Category_Count)
SELECT C.CategoryID, COUNT(*) AS Category_Count
FROM Categories C
INNER JOIN Products P ON C.CategoryID = P.CategoryID
GROUP BY C.CategoryID, C.CATEGORYNAME

– JOIN ON THE TEMP TABLE TO GET OUR VALUES

SELECT C.CategoryID, C.CategoryName, P.ProductName, P.UnitPrice, #Temp_Example.Category_Count
FROM Categories C
INNER JOIN Products P ON C.CategoryID = P.CategoryID
INNER JOIN #Temp_Example ON C.CategoryID = #Temp_Example.CategoryID
ORDER BY C.CategoryName

– DROP TEMPORARY TABLE

DROP TABLE #Temp_Example

GO

SELECT GETDATE()


Now, let’s take a look at a query that produces the same result, but does not use a temporary table, but instead uses a derived table.

SELECT GETDATE()

GO

– NOTE HOW WE SIMPLY JOIN ON THE TABLE CREATED IN MEMORY BASED ON THE CATEGORY ID

SELECT C.CategoryID, C.CategoryName, P.ProductName, P.UnitPrice, CT.Category_Count
FROM Categories C
INNER JOIN Products P ON C.CategoryID = P.CategoryID
INNER JOIN ( 
            SELECT C.CategoryID, COUNT(*) AS Category_Count
            FROM Categories C 
            INNER JOIN Products P ON C.CategoryID = P.CategoryID
            GROUP BY C.CategoryID, C.CategoryName 
            )CT ON C.CategoryID = CT.CategoryID
ORDER BY C.CategoryName

GO

SELECT GETDATE()


The first query’s performance is dramatically improved with little effort by using a derived table, as demonstrated in the second query.

Here are the steps we took to improve the queries performance:

1. We took the SELECT query from our SELECT INTO query, and put it in parentheses followed by a table name.
2. We joined on the table now in memory, rather than a temporary table.
3. We changed the name of the column selected in our SELECT statement to the name of the column in the nested query.

As you can see, using derived tables is a great way to solve complex business problems!

Give It a Test

I recently read the chapter on advanced queries in the book, Professional SQL Server 2000 Programming, and it states that if your result is going to be large, then it might be better to create a temporary table with an index, because derived tables do not have an index. Don’t let the last statement scare you, I’ve seen a few queries go from using temporary tables or cursors to derived tables and the performance was increased by 50% or greater. As always, you will never know until you test!

Pages: 1 2




Related Articles :

  • No Related Articles Found

7 Responses to “Eliminate the Use of Temporary Tables For HUGE Performance Gains”

  1. This advice is ONLY applicable to OLTP workloads (e.g., small quick transactions). SQL Server “derived tables” (subqueries to everyone else) are **highly** optimised for small SELECTs.

    In a DW / ETL / OLAP type workloads the exact **opposite** advice applies. In DW processing I often see HUGE performance gains from converting selects to temp tables.

  2. This is completely misleading. Everything exists for a reason or MS wouldn’t have invented temp tables. Derived tables are good for small quantity of data and select operation. When the data volumn is huge and often combined with operations like Group by, temp table can be lot faster by creating and using indexes.

  3. I agree with Joe as we have experienced the exact opposite effect with large result sets and fairly complex aggregation queries (derived tables) joining against each other. We eliminated the use of derived tables by placing the data into temp tables and took a query that was running for close to 11 hours down to running under 11 seconds. The execution plan indicated the optimizer had no idea what to expect from the derived tables as the actual number of rows was very different from the estimated number of rows.

  4. I have the same experience with Derived tables (Subqueries).

    An average select query performs nicely when retrieving all 82000 rows. Execution time is 1 second.

    Then I was asked to sum 1 column and to group it on the 2 remaining integer columns.

    I tried a derived table, Sum the main query and a temp table.

    The Temp Table keeps it in 1 second. The other 2 solutions are within 4/5 minutes.

    Thus my question; has any1 have any experience of forcing the query execution plan of the derived table?

  5. I think the exact opposite for general sql server queries. I have tried a query returning around 7000 rows both using temporary table(even without index) and using derived table.

    The difference is significant. Using temporary table takes literally 0ms while the derived table version takes at least 17ms.

    One of the bottleneck of the temporary table is of course the locking over entire tempdb. But it never implies that we should avoid that. I suggestion is to use temporary table whenever possible only keeping the locking issue under consideration.

  6. I’ll ditto what others have said. I use both derived and temp tables. I run a bunch of procedures start-of-day to double-check some info. In cases where it’s a small list of stuff to compile for reference, a derived table works like a charm. But, I found noticeable improvements when I decided to pre-compile a lot of aggregate data early as temp tables which are used in a lot of later queries. I don’t just make #temp tables, though, I make actual tables in the tempdb. I use them through-out the day while working on other queries, or ad-hoc’ing something on request. The checks I run used to take 5 minutes total every day. After I started temp tableing larger things, it’s down to 1:30, and that includes the time to temp-table the stuff. As for wear-n-tear on disk I/O and things … when I do the math for how much time I’m saving, times by how much I get paid by minute … the company can afford a little wear-n-tear on a hard drive compared to me twiddling my thumbs for an extra 3:30. (Besides, that 3:30 is enough time for me to get distracted in something else, which kills productivity totally.)

  7. I have the same experience as above.
    temp table really fast during complex select statement.
    and also very fast during batch insert/update.

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |