New T-SQL Features in SQL Server 2012

Paging Data

Paging is a big issue for developers as it is required for many applications but entails many performance problems.  Developers have used different workarounds to support paging.

For example, assume we need to display several pages of the HumanResource.Employee object with 5 rows per page. Below is the query you need to execute this in SQL Server 2012.

DECLARE @page_index SMALLINT ,
@page_Size SMALLINT = 5,
@offset INT
SET @page_index = 2
SET @offset = ( (@page_index - 1) * @page_Size)
SELECT   BusinessEntityID,
         NationalIDNumber,
         JobTitle
FROM HumanResources.Employee
ORDER BY BusinessEntityID
OFFSET @offset ROWS
FETCH NEXT @page_Size ROWS ONLY

In this query you will note that there are two additional syntaxes added to the well-known SELECT query. 

OFFSET <offset> ROWS

The number of rows to exclude from the top.

FETCH NEXT <page_size> ROWS ONLY

The number of rows to display.

So in the results you will only receive the data you need.

You must remember that ORDER BY clause is a must, you can have multiple columns for the ORDER BY clause.

SELECT SOH.SalesOrderID
       ,SOD.SalesOrderDetailID
       ,P.Name As ProductNam
       FROM
Sales.SalesOrderDetail SOD INNER JOIN Sales.SalesOrderHeader SOH
ON SOD.SalesOrderID = SOH.SalesOrderID
INNER JOIN Production.Product P ON P.ProductID = SOD.ProductID
ORDER BY SOH.SalesOrderID,SOD.SalesOrderDetailID DESC
OFFSET 100 ROWS
FETCH NEXT 25 ROWS ONLY

The query plan for this will appear as below:

As you can see, the Top operator will be introduced at an almost zero percent cost. 

Lag & Lead

This is another T-SQL feature added to SQL Server 2012 addresses an issue in reporting.

Let me describe this feature with a real world example.

Below are the sales amounts for one month.

The above data was generated from the FactInternetSales table in the AdventureWorksDWDenali database. You can find the same script in the 5528EN _02_03.sql file in the code folder.

For reports you need this month’s sales and last month’s sales along with the difference between the two months.

If you were asked to write a query for this, you query would look like:

SELECT
 t1.Year,
 t1.Month,
 t1.SalesAmount as SalesAmountThisMonth,
 t2.SalesAmount as SalesAmountLastMonth,
 t1.SalesAmount - t2.SalesAmount SalesDifferent
FROM MonthlySales as t1
LEFT OUTER JOIN MonthlySales as t2
ON (t1.Year = t2.Year) AND (t1.Month = t2.Month+1)
ORDER BY Year, Month

You need to use LEFT OUTER JOIN as INNER JOIN will not give any rows for the very first month in the data set.

Obviously, there will be a either an index scan or a table scan (depending whether you have an index or not) twice for the table which is costly.

With SQL Server 2012, the new T-SQL function LAG is introduced just to address the above scenario:

SELECT
 Year,
 Month,
 SalesAmount as SalesAmountThisMonth,
 LAG(SalesAmount, 1, NULL) OVER(ORDER BY Year, Month) as SalesAmountLastMonth,
 SalesAmount - LAG(SalesAmount, 1, 0) OVER(ORDER BY Year, Month) SalesDifferent
FROM MonthlySales

As shown in the above example, the LAG analytical function has three parameters. The first parameter is the expression which is the column in this scenario.  The second parameter is the offset, since we are looking for last month this parameter is 1.  If you want the last year’s comparative month’s data to be displayed then this parameter should be 12. The third parameter is the default, in case there is no match for the previous month, default value will be displayed.

So the output of the above query will be:

So how does this new syntax compare with the pre-2012 syntax? The simple way to compare them is running both queries in one batch and check the execution plan as shown in the below image. In the batch, the first query is the old style query and second query is the one using the LAG function.

Relative cost wise, the first query (old style LEFT OUTER JOIN) cost is 77% whereas the query which contains LAG cost is 23%. These queries were analyzed for Reads and duration using the SQL Profiler.

Below are the results for both queries executed after clearing the cache.

Reads

Duration (ms)

LEFT OUTER JOIN

125

8

LAG

46

3

 

Obviously using LAG has a clear performance advantage. However, to achieve better performance you need to cover those columns (in the above scenario, it will be Year and Month).

You can also use the LAG function for grouping. For example, if you want product model sales differenciated by name you could use the below query.

SELECT
 ModelName,
 Year,
 Month,
 SalesAmount as SalesAmountThisMonth,
 LAG(SalesAmount, 1, NULL) OVER(PARTITION BY ModelName ORDER BY Year, Month) as SalesAmountLastMonth,
 SalesAmount - LAG(SalesAmount, 1, 0) OVER(PARTITION BY ModelName ORDER BY Year, Month) SalesDifferent
FROM ModelNameWiseMonthlySales

And the result is:

From the above results, you will see that the Sales Amount for the previous month is taken only for the model name.

LEAD is another new function which is similar to LAG but instead shows the future records.

SELECT
 Year,
 Month,
 SalesAmount as SalesAmountThisMonth,
 LAG(SalesAmount, 1, NULL) OVER(ORDER BY Year, Month) as SalesAmountLastMonth,
 LEAD(SalesAmount, 1, NULL) OVER(ORDER BY Year, Month) as SalesAmountNextMonth
FROM MonthlySales

As you can see from the below screenshot, the above query will give you the previous month’s and next month’s sales values in a single query.

FIRST_VALUE & LAST_VALUE

FIRST_VALUE will return the first value while LAST_VALUE will return the last in an ordered set of values in SQL Server 2012.

Take for example, the below query.

SELECT
d.LastName,
d.GroupName,
e.JobTitle,
e.HireDate,
FIRST_VALUE(e.HireDate) OVER (PARTITION BY d.GroupName  ORDER BY e.JobTitle) AS FirstValue,
LAST_VALUE(e.HireDate) OVER (PARTITION BY d.GroupName  ORDER BY e.JobTitle) AS LastValue
 FROM HumanResources.Employee e
INNER JOIN HumanResources.vEmployeeDepartmentHistory d
ON e.BusinessEntityID = d.BusinessEntityID

This query will join the HumanResources.Employee table and the HumanResources.vEmployeeDepartmentHistory view together via BusinessEntityID. You can see that LastName, JobTitle, GroupName and Hiredate are selected from either the table or the view.

If you closely look at the LAST_VALUE and the FIRST_VALUE function, you will observe that the Partition column and the order by column (Which are HireDate and GroupName respectively) are the same.

FIRST_VALUE will give you the first value of the hiredate for each group when it orders by job title.

The results are given below:

Leave a comment

Your email address will not be published.