Searching with optional arguments | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Searching with optional arguments

I have a report in which the user can choose zero or more of several fields. Then I call a stored procedure with the values to the fields choosed and NULL to the others. How can I write my select in order to get good performance, without table scans? My procedure, which works but don’t use good indexes, is something like: create proc p1
@a int, @b int, @c int
as
select x,y,z
from table1 — with good index on column a, for example
where a = isnull(@a,a)
and b = isnull(@b,b)
and c = isnull(@c,c) Another way, that also don’t use good indexes, is: where ((@a is not null) and a = @a) or (@a is null))
and ((@b is not null) and b = @b) or (@b is null))
and ((@c is not null) and c = @c) or (@c is null)) Actually, my procedure has more than 10 arguments.
Anybody has a tip?
Refer http://sql-server-performance.com/forum/topic.asp?TOPIC_ID=708
http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=624 HTH. Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

It is true, in these cases indexes will not be used. you could try using sp_executesql(sql) and construct it to use first the columns that are indexed. maybe limit the user to be using at least
one indexed column. I don’t really like using too much sp_executesql mainly for security reason, but in this case it would give you better performance. if you go with this make sure you check your strings carefully. Another thing, it will not work without SELECT permissions on the tables. If you want to avoid granting permissions directly on tables you would have to create views and work on them. Or else, grant/revoke permissions at a column level. Bambola.
you should give serious consideration to write separate queries for each search argument with a good index
and use an if block IF @b IS NOT NULL
select x,y,z
from table1 — with good index on column a, for example
where a = @a
ELSE IF @b IS NOT NULL
etc, etc otherwise, the optimizer may not use indexes period, or it may use an index intersection, which is not what you want
Joe’s method doesn’t work well in my case because the user can choose any combinations of fields. Then, the if would be much complex and long. I’d trying to runaway from dynamic sql due to security problems, but up to the moment is, obsviously, the one that performs better. If somebody has the silver bullet…
Did you see Chappy’s solution inhttp://sql-server-performance.com/forum/topic.asp?TOPIC_ID=708 ? Also the solution posted by me in the other post is also on the same lines. Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

sorry, didn’t realize you said one or more in your original post.
in which case, some use of dynamic sql is probably desirable, i would still try to identify the most common parameter sets and use the if block for those, and dynamic for the rest
while the dynamic sql might be much better than table scans,
it will still require a parse and may require frequent recompiles, which is much worse than an explicit statement with a simple efficient plan for security concerns with dynamic sql, try using parameterized query with
sp_executesql
How about Chappy’s Solution… where ((@a is null) or a = @a)
and ((@b is null) or b = @b)
and ((@c is null) or c = @c)
Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

You could write special blocks for those that can be indexed and leave a more general for those that can’t This example assumes that there are 3 indexes each with a different leading column e.g.
if( @a is not null ) — most selective
select x,y,z
from table1 — with good index on column a, b, c for example
where a = @a
and b = isnull(@b,b)
and c = isnull(@c,c) else if( @b is not null )
select x,y,z
from table1 — with good index on column b, c for example
where b = @b
and c = isnull(@c,c) else if( @c is not null ) — least selective
select x,y,z
from table1 — with good index on column c, for example
where c = @c

Twan – I was thinking about the exact same thing! I would try to do it this way, espacially if query runs often and table is not that small. It should not be that difficult since code would almost repeat itself. There would, of course, be an else where you can decide if you wanna use sp_executesql with it’s implications (and advantages) or go with a table scan. Gaurav – I tried this solution before. I thought that the condition will be eveluated left to right, therefor if variable IS NULL the second condition in the OR statement will not be evaluated and I would avoid comparing variable to the table column. As far as I remember it doesn’t work like this, and no matter what value the vairable contains, you end up with table or index (if column has index) scan. EDIT: I was thinking AND and writing OR. Please ignore the second part of this post. Bambola.
Bambola, Try this…<pre>Create table Random_Data<br />(<br />col1 int Primary Key Identity (10, 10) NOT NULL,<br />col2 int NOT NULL DEFAULT CASE<br />WHEN CONVERT(int, Rand() *1000) % 2 = 1 <br />THEN (Convert(int, rand() * 100000) % 10000 * -1)<br />else Convert(int, Rand() * 100000) % 10000<br />end,<br />col3char(15) null default char((convert(int, rand() * 1000) % 26) + 65)<br />+ char((convert(int, rand() * 1000) % 26) + 65)<br />+ char((convert(int, rand() * 1000) % 26) + 65)<br />+ char((convert(int, rand() * 1000) % 26) + 65)<br />+ Replicate(char((convert(int, rand() * 1000) % 26) + 65) , 11)<br />)<br />declare @counter int<br />set @counter = 1<br />while (@counter &lt;= 10000)<br />begin <br />insert random_data default values<br />set @counter = @counter + 1<br />end<br /><br />insert into Random_data(col2, col3)<br />values (Default, null)<br /><br />create nonclustered index NCL_random_data on random_data(col3) –Create non clustered index to verify <br /> –if the index will be used in query<br /><br />declare @input char(15)<br /><br />select @input = NULL –‘KVVDYYYYYYYYYYY'<br /><br />select col1, col3<br />from random_data<br />where ((col3 is null) and (@input is null)) or (col3 = @input)</pre><br /><br />This query uses index and does non clustered index seek on my machine.<br /><br /> |–Index Seek(OBJECT<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[master].[dbo].[Random_Data].[NCL_random_data]), SEEK<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />[Random_Data].[col3]=[@input]) ORDERED FORWARD)<br /><br />Gaurav<br /><i>Moderator<br /><font size="1">Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard</i></font id="size1">
Chappy’s solution works but doesn’t perform well. It makes table or index scan. Gaurav’s solution performs well, but doesn’t solve the problem, because a NULL argument means ‘I don’t want to filter on this column’. Twan solution is a good idea. However it is not so elegant (that’s life!).
But that’s what you wanted right? ou don’t want to filter on the columns for which a value has not been selected. Am I missing something here? Can you please explain. Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

Here is what I had in mind declare @OrderID int
declare @ShipPostalCode nvarchar(10)
–select @ShipPostalCode = ‘50739’
–select @OrderID = 10508 select OrderID
from northwind.db:confused:rders
where
(
( @OrderID IS NULL )
OR
((@OrderID IS NOT NULL) AND (OrderID = @OrderID) )
)
AND
(
(@ShipPostalCode IS NULL)
OR
((@ShipPostalCode IS NOT NULL) AND (ShipPostalCode = @ShipPostalCode) )
)
This is how it should (?) be working.
If variable @OrderID IS NULL, it should move to check the next condition – @OrderID IS NOT NULL – which would be false, and exit this AND condition since false AND whatever is false. In this case it should avoid evaluating the last expression – OrderID = @OrderID, which is what I was trying to achieve. I still have to figure out if it is actually evaluating it or not. If @OrderID IS NOT NULL, it will still check the second condition since they are in OR, @OrderID IS NOT NULL would evaluate to true, and it should check OrderID = @OrderID and return corrisponding rows. So far so good, and it can be a solution for these kind of problem if I could get it to work like I want to 😛 But… Both OrderID and ShipPostalCode are indexed. But no matter what combination of values I use select @ShipPostalCode = ‘50739’
select @OrderID = NULL select @ShipPostalCode = NULL
select @OrderID = 10508 select @ShipPostalCode = ‘50739’
select @OrderID = 10508 it always does index scan. It’s better than a table scan but I would have liked to know – is there anyway to see here index seek???
Gaurav – I have tried the code bellow and i still get index scan. What do you get when you run it? select OrderID
from northwind.db:confused:rders
where
((@OrderID is null) or OrderID = @OrderID)
and
((@ShipPostalCode is null) or ShipPostalCode = @ShipPostalCode) Bambola. If we are to achieve results never before accomplished, we must expect to employ methods never before attempted. – Francis Bacon

Gaurav,
Using the table in your example, all rows should be returned when @input is null, meaning ‘I don’t care about col3 values’. However, due to ‘col3 is null’ expression, when @input is null, just the rows with col3 = NULL are returned. In your example, try this to see what I mean: select @input = NULL Your statement:
select col1, col3
from random_data
where ((col3 is null) and (@input is null)) or (col3 = @input) returns just 1 row
My statement:
select col1, col3
from random_data
where ((@input is null)) or (col3 = @input) returns all rows, that is what I want, but does index scan.
I ran the query for 10000 records. The northwind database has 830 rows. Also the ShipPostalCode does not have any index. So I guess the query optimizer has chosen the table scan as the best option. My case had non clustered index for the search column. Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

Now we are getting where the 3 way behaviour of NULL is taking a toll on us. See the think is that you have to consider NULL as a data value or a empty data. If you consider it as a data value, the requirements can’t be met using my method. But if you use it as an empty value, then u’ll have to convert all NULL’s present in the database into some default value which will remove this confusion.
NULL is one thing in SQL Server which can introduce many bugs and make the life of programmer HELL. Gaurav
Moderator
Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

]]>