Fun with Numbers in Transact-SQL Queries

Recently, someone asked in a Microsoft public SQL Server newsgroup how can you sort characters within a string? For example, a string contains ‘CBA’, and he wanted to sort the characters within the string and make it ‘ABC’. Instead of going for a procedural solution, just for fun, I tried approaching it in a relational way (T-SQL specific). In this article, I’ll show you how we can make use of a numbers table for writing this kind of innovative queries. 

Let’s first create a numbers table called ‘Numbers’, with only one column called ‘Number’. The ‘Number’ column is an IDENTITY column with a primary key and clustered index defined on it. A clustered index in this case improves the performance of queries, as we will be querying this numbers table for ranges of numbers. 

The following script will drop the ‘Numbers’ table if it already exists, create it, and populate the table with numbers 1 to 8000 (We need only 8000 numbers as a char/varchar variable or column can have a maximum of 8000 characters). Notice the use of “DEFAULT VALUES”. That’s how one should insert rows into a table with a single column, and that column happens to be the IDENTITY column. I’m using the IDENTITY column here, just for the sake of convenience and simplicity, and there’s nothing stopping you from using a smallint column and insert 8000 rows by incrementing a variable.

SET NOCOUNT ON
GO

IF EXISTS
(
SELECT 1
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = ‘Numbers’
AND TABLE_SCHEMA = ‘dbo’
AND TABLE_TYPE = ‘BASE TABLE’
)
BEGIN
DROP TABLE dbo.Numbers
END
GO

CREATE TABLE dbo.Numbers
(
Number smallint IDENTITY(1, 1) PRIMARY KEY
)
GO

WHILE 1 = 1
BEGIN
INSERT INTO dbo.Numbers DEFAULT VALUES

IF @@IDENTITY = 8000 
BEGIN
BREAK
END
END
GO

Now let us address the problem at hand: “Sorting the characters within a string.” 

The following script will use the Numbers table to split the string into individual characters and create a derived table, from which it creates a sorted string using an aggregate concatenation query. This is a much better approach towards this problem, compared to you implementing a sorting algorithm in your scripts. Why reinvent the wheel when SQL Server can do the required sorting?

DECLARE @input varchar(100), @output varchar(100), @len smallint
SET @input = ‘CDBEA’
SET @output = ”
SET @len = LEN(@input)

SELECT @output = @output + Val 
FROM 
(
SELECT TOP 100 PERCENT SUBSTRING(@input, Number, 1) AS Val
FROM dbo.Numbers (NOLOCK)
WHERE Number <= @len
ORDER BY Val
) AS Derived

SELECT @input AS ‘Original string’, 
@output AS ‘Sorted string’

This idea can be extended for other scenarios. How about extracting the unique characters from a given string? That is, given an input of ‘abbcccdddd’, output ‘abcd’. The following script will do exactly that, using the DISTINCT keyword. Again, this approach is making use of SQL Server’s ability to return only distinct rows from a given set. It would be a complicated task if you have to implement this ability on your own.

DECLARE @input varchar(100), @output varchar(100), @len smallint
SET @input = ‘ABBCCCDDDD’
SET @output = ”
SET @len = LEN(@input)

SELECT @output = @output + Val
FROM
(
SELECT DISTINCT TOP 100 PERCENT SUBSTRING(@input, Number, 1) AS Val
FROM dbo.Numbers (NOLOCK)
WHERE Number <= @len
ORDER BY Val
) AS Derived

SELECT @input AS ‘Original string’,
@output AS ‘Sorted string with UNIQUE characters only’

Continues…

Leave a comment

Your email address will not be published.