Common Table Expressions (CTE) for SQL Server Developers


What is A CTE?

A CTE is a temporary view that exists only in the scope of
running query. A CTE is similar to a view but a CTE is embedded into the query.
CTEs can be used in Stored Procedures, User Define Functions (UDFs), Triggers
and Views but not in Indexed Views.

Below is the syntax for a CTE :

In the database, you will have a physical table called
Sales.Customer and from the above query you are taking that table into a CTE
called AllmyCustomers which will be referred in a future query.

Difference between a CTE and Variable Table

Obviously, a CTE is different from a temporary table sice temporary
tables save data to the tempdb but the CTE will be kept in the memory. So are is the
differences between CTEs and table variables?

  • A table variable’s scope is for the batch, while a CTE’s scope is only
    for the query.
  • To populate a table variable, you need to execute insert scripts,
    but with CTEs the data will be populated at time of defining the CTE.

  • If you want to use recursion in table variables, you need to write
    your own queries but with a CTE it is much easier to write recursive queries.

  • If you are using larger data sets in table variables it will use
    tempdb to store data physically. However, a CTE always uses memory. This
    can be an issue in electing for CTEs as you need to ensure that you are not using
    larger data sets that will drain memory. The following graph shows tempdb write transactions/sec
    for temp tables, table variables and CTE scenarios.
    You can simply see
    that a CTE does not write anything to the tempdb, hence it has a higher performance compared with the
    other two options.

Usages for CTE

There are a few scenarios where using CTEs is optimal:

Recursion

Recursion is typically a difficult development task. Let us look atan example, say we have following
data set:

IF EXISTS (SELECT 1 FROM sys.tables WHERE Name = 'Employees')

DROP TABLE Employees

CREATE TABLE Employees

(EmployeeID INT PRIMARY KEY,
Name VARCHAR(50), ReportsTo INT)

The important point to remember is ReportsTo is foreign key
constraint to the EmployeeID column or as some of you call it, this table has a
self-relationship between two columns of the same data.

We will now populate this with the following sample data:

INSERT INTO Employees

VALUES

(1,'Richard',NULL),
(2,'Stephan',1),
(3,'Clemens',2),
(4,'Simmon',2),
(5,'Anderson',4),
(6,'Cris',1),
(7,'Andrew',5),
(8,'Peter',6)

If you visualize your data, it will looks as below.

Richard is the boss of the company while Stephan and Cris
report directly to Richard.

Let us say you want to find out all the employees that come under
Stephan who has an employeeid of ‘2’.

WITH EmployeeRecursion(Name,EmployeeID,ReportsTo)

AS

(
 SELECT Name,EmployeeID,ReportsTo
 From Employees WHERE EmployeeID = 2 -- @EmpID

 UNION ALL

 SELECT e.Name,e.EmployeeID,e.ReportsTo
 From Employees e INNER JOIN EmployeeRecursion r
 ON e.ReportsTo = r.EmployeeID
 )


 SELECT R.EmployeeID,R.Name,E.Name Boss
 FROM EmployeeRecursion  R
 INNER JOIN Employees E ON R.ReportsTo = E.EmployeeID

This results in the below output :

Below is the query plan for the above CTE which shows
one index scan and two index seeks. This is after adding a non-clustered index
to the ReportsTo column

In recursive queries, you can set number of levels you need
to stop recursion by applying the MAXRECURSION option.

--Max Recursion
WITH EmployeeRecursion(Name,EmployeeID,ReportsTo)

AS
(

SELECT Name,EmployeeID,ReportsTo
From Employees WHERE EmployeeID = 2

UNION ALL

SELECT e.Name,e.EmployeeID,e.ReportsTo

From Employees e INNER JOIN EmployeeRecursion r

ON e.ReportsTo = r.EmployeeID
)

 SELECT R.EmployeeID,R.Name,E.Name Boss
 FROM EmployeeRecursion  R
 INNER JOIN Employees E ON R.ReportsTo = E.EmployeeID

OPTION (MAXRECURSION 2)

Though this returns the correct values indented, error
of level 16 will be generated triggering an exception.

Msg 530, Level 16, State 1, Line 2 

The statement terminated. The maximum recursion 2 has been exhausted
before statement completion.

This will prevent users to using the MAXRECUSION option. This
issue exists in all SQL Server versions up to SQL Server 2012. You can overcome
this by, introducing a level column to the recursion as shown below.

WITH EmployeeRecursion(Name,EmployeeID,ReportsTo,Level)

AS
(
SELECT Name,EmployeeID,ReportsTo,0 Level
 From Employees WHERE EmployeeID = 2
 UNION ALL

 SELECT e.Name,e.EmployeeID,e.ReportsTo, Level + 1
 From Employees e INNER JOIN EmployeeRecursion r
 ON e.ReportsTo = r.EmployeeID

 WHERE Level <2
 )

 SELECT R.EmployeeID,R.Name,E.Name Boss
 FROM EmployeeRecursion  R
 INNER JOIN Employees E ON R.ReportsTo = E.EmployeeID

Finding Duplicates

CTEs can also assist in finding duplicates in our data by improving both performance and readability. Lets through this by an example. First we will create the below table with data

 --Finding Duplicates

IF EXISTS (SELECT 1 FROM sys.tables WHERE Name = 'Products')
DROP TABLE Products
CREATE TABLE Products

(ProductID INT PRIMARY KEY CLUSTERED,
ProductName VARCHAR(50),
Price NUMERIC(9,2)
)

INSERT INTO Products

VALUES
(1,'Mouse',20.15),
(2,'KeyBoard',15.75),
(3,'Monitor',214),
(4,'UPS',88),
(5,'Mouse',20.14),
(6,'Monitor',251),
(7,'Mouse Pad',5),
(8,'Monitor',265)

Now, you are given the task of finding duplicates:

Below is the traditional way of doing it.

SELECT * FROM Products
WHERE ProductID NOT IN

(
SELECT MIN(A.ProductID) FROM Products A
WHERE  Products.ProductName = A.ProductName
);

Using CTEs will eliminate the subquery and improve the performance:

WITH DuplicateProducts
AS

(
SELECT MIN(ProductID) AS ProductID , ProductName
FROM Products
GROUP BY ProductName
HAVING COUNT(1) > 1
)

SELECT P.ProductID,P.ProductName
FROM Products P INNER JOIN DuplicateProducts cte
ON P.ProductName = cte.ProductName
AND P.ProductID >cte.ProductID;

Pages: 1 2




Array

No comments yet... Be the first to leave a reply!