Select Based on List of Random Values — Dynamic SQL or CSV Parsing UDF?

Two related questions come up frequently in Microsoft SQL Server discussion groups:

  1. How to pass an array to a stored procedure?
  2. How to select data based on a list of values?

Examples:

  1. A user wants to search for jobs based on a list of desired locations.
  2. A user marks several e-mails in an inbox for deletion.

SQL’s set-based processing does not handle such arbitrary-list criteria well. There is the IN operator, of course, but how do you pass such a list to a stored procedure, and how do you get a good execution plan without a recompilation each time the query is executed?

The Problem

Let us prepare a concrete scenario so that we can test and compare different solutions. Let’s define a table structure and a stored procedure for orders that may pass through different statuses during their life cycle. For example, an order may be submitted, reviewed, approved, waiting for approval, in process, rejected, closed, and so on.

Let’s define the orderStatuses table first:

create table dbo.OrderStatuses(
     orderStatusID     tinyInt primary key,
     orderStatusName     varchar(30),
     orderStatusDescription     varchar(255)
)
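
For reference, the table could be filled with the seven statuses mentioned above. This is only an illustrative sketch: the assignment of IDs to names is an assumption, and the descriptions are left empty.

```sql
-- Illustrative sample data only: the ID-to-name mapping below is an
-- assumption, not something fixed by the schema.
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(1, 'Submitted')
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(2, 'Reviewed')
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(3, 'Approved')
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(4, 'Waiting for approval')
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(5, 'In process')
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(6, 'Rejected')
insert into dbo.OrderStatuses(orderStatusID, orderStatusName) values(7, 'Closed')
```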

For testing purposes, we can use a simplified orders table that contains a reference to the user table, a reference to the orderStatuses table, and additional info represented by a varchar(200) column:

create table orders(
     orderID     int identity(1,1) primary key clustered,
     userID     int not null,
     orderStatusID     tinyInt not null,
     orderInfo     varchar(200)
)

In this article we will review a few implementations of the stored procedure that returns a count of orders for a given user and a list of statuses. The stored procedure structure is shown below:

create procedure orders_sel_by_userID_orderStatusList
     @userId     int,
     @orderStatusList     varchar(100)
as begin
     set noCount on
<implementation code here>
end
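
To make the goal concrete, here is a sketch of how such a procedure might be called once an implementation is in place, and the ad hoc query its result should match. The comma-separated format of @orderStatusList is an assumption at this point; the signature only says that it is a varchar(100).

```sql
-- Hypothetical call: count the orders of user 1 in statuses 1, 2 and 3.
-- The comma-separated list format is an assumption.
exec orders_sel_by_userID_orderStatusList @userId = 1, @orderStatusList = '1,2,3'

-- The result should match this query with a hard-coded list:
select count(*) from orders where userID = 1 and orderStatusID in (1, 2, 3)

-- The parameter cannot simply be dropped into IN. The line below would
-- compare orderStatusID against the single string '1,2,3', which cannot
-- be converted to tinyint, so it is left commented out:
-- select count(*) from orders where userID = 1 and orderStatusID in (@orderStatusList)
```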

To populate the table we can use the script below:

set noCount on
declare @StatusCount     int     set @StatusCount = 7
declare @UserCount     int     set @UserCount = 50
declare @Count     int     set @Count = @StatusCount * @UserCount * 10000
declare @i     int     set @i = 1
while @i <= @Count begin
     insert into orders(userID, orderStatusID, orderInfo)
     values(@i % @UserCount + 1, @i % @StatusCount + 1, 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII')
     set @i = @i + 1
     if @i % 10000 = 0 print @i
end
set noCount off
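
Once the script finishes, a quick sanity check can confirm the distribution. Because 50 and 7 have no common factor, the two modulo expressions cycle through all 350 user/status combinations evenly, so each combination should end up with 10,000 rows, 3,500,000 in total. A minimal sketch of such a check (the column aliases are arbitrary):

```sql
-- Total number of rows; expected 3500000 (7 * 50 * 10000)
select count(*) as totalOrders from orders

-- User/status pairs whose row count differs from 10000;
-- expected to return no rows
select userID, orderStatusID, count(*) as orderCount
from orders
group by userID, orderStatusID
having count(*) <> 10000
```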

Finally, we need an index on userID and orderStatusID:

create index orders_UserID_orderStatusID on orders(userID, orderStatusID)

To test the performance of each implementation, I used this script:
<test script> (ZIP – 3.12 KB)

I tested performance with a single status ID, with two and three statuses, and with different parameters passed in subsequent executions.


Continues…
