Ranking Functions and Performance in SQL Server 2005

However, when you run the queries, the second result will be wrong:

Query 1 retrieves the correct result:

RankNum

denseRankNum

orderID

1

1

4

1

1

4

1

1

4

4

2

7

5

3

11

5

3

11

7

4

15

8

5

21

Query 2 retrieves the wrong result:

rankNum

denseRankNum

orderID

1

1

7

1

1

11

1

1

4

1

1

21

1

1

15

1

1

11

1

1

4

1

1

4

Even though the expressions in the “order by” clause help to skip sorting, they can’t be applied to the RANK() and DENSE_RANK() functions. Apparently, these ranking functions must have a sorted input to produce the correct result.

Now let’s look at the NTILE() function:

Listing 5. NTILE() function with expressions in an ORDER BY clause.
— Query 1: ORDER BY orderID.
SELECT NTILE(3) OVER (ORDER BY orderID) AS NTileNum, orderID
FROM RankingFunctions;
GO
— Query 2: ORDER BY expression.
SELECT NTILE(3) OVER (ORDER BY (SELECT 1)) AS NTileNum, orderID
FROM RankingFunctions;

Analyzing the execution plans for both queries (see Figure 5), you will find that:

  • The second query skips sorting, meaning the solution is working.
  • The results of both queries are correct.
  • The optimizer is using Nested Loops, which in some situations can be heavy.

Performance of Ranking Functions

Now, when you know how to avoid sorting in ranking functions you can test their performance.

Let’s insert more rows into the RankingFunctions table (Listing 6):

Listing 6. Insert more rows into the RankingFunctions table.
IF EXISTS (SELECT * FROM sys.objects
     WHERE object_id = OBJECT_ID(N'[dbo].[RankingFunctions]’) AND type in (N’U’))
DROP TABLE RankingFunctions

SET NOCOUNT ON
CREATE TABLE RankingFunctions(orderID int NOT NULL);
INSERT INTO RankingFunctions VALUES(7);
INSERT INTO RankingFunctions VALUES(11);
INSERT INTO RankingFunctions VALUES(4);
INSERT INTO RankingFunctions VALUES(21);
INSERT INTO RankingFunctions VALUES(15);

DECLARE @i as int, @LoopMax int, @orderIDMax int;
SELECT @i = 1, @LoopMax = 19;
WHILE (@i <= @LoopMax)
BEGIN
     SELECT @orderIDMax = MAX(orderID) FROM RankingFunctions;
     INSERT INTO RankingFunctions(OrderID)
     SELECT OrderID + @orderIDMax FROM RankingFunctions;
     SELECT @i = @i + 1;
END

SELECT COUNT(*) FROM RankingFunctions;
— 2,621,440.

UPDATE RankingFunctions
     SET orderID = orderID/5
     WHERE orderID%5 = 0;

The INSERT and SELECT parts of the INSERT…SELECT statement are using the same RankingFunctions table.

The number of generated rows can be calculated as:

generated rows number = initial rows number * power(2, number of loop iterations)

Since RankingFunctions initially has 5 rows and @LoopMax = 19, the number of generated rows will be:

5 * POWER(2,19) = 2,621,440

To increase the entropy in the row order, I changed (updated) the orderID values in the rows where orderID can be divided by 5 without the remainder.

Continues…

Leave a comment

Your email address will not be published.