SQL Server Performance

Searching with optional arguments

Discussion in 'T-SQL Performance Tuning for Developers' started by nardi, Aug 20, 2003.

  1. nardi New Member

    I have a report in which the user can choose zero or more of several fields. Then I call a stored procedure with the values to the fields choosed and NULL to the others. How can I write my select in order to get good performance, without table scans?

    My procedure, which works but don't use good indexes, is something like:

    create proc p1
    @a int, @b int, @c int
    as
    select x,y,z
    from table1 -- with good index on column a, for example
    where a = isnull(@a,a)
    and b = isnull(@b,b)
    and c = isnull(@c,c)

    Another way, that also don't use good indexes, is:

    where ((@a is not null) and a = @a) or (@a is null))
    and ((@b is not null) and b = @b) or (@b is null))
    and ((@c is not null) and c = @c) or (@c is null))

    Actually, my procedure has more than 10 arguments.
    Anybody has a tip?
  2. gaurav_bindlish New Member

  3. bambola New Member

    It is true, in these cases indexes will not be used. you could try using sp_executesql(sql) and construct it to use first the columns that are indexed. maybe limit the user to be using at least
    one indexed column. I don't really like using too much sp_executesql mainly for security reason, but in this case it would give you better performance. if you go with this make sure you check your strings carefully. Another thing, it will not work without SELECT permissions on the tables. If you want to avoid granting permissions directly on tables you would have to create views and work on them. Or else, grant/revoke permissions at a column level.

    Bambola.
  4. joechang New Member

    you should give serious consideration to write separate queries for each search argument with a good index
    and use an if block

    IF @b IS NOT NULL
    select x,y,z
    from table1 -- with good index on column a, for example
    where a = @a
    ELSE IF @b IS NOT NULL
    etc, etc

    otherwise, the optimizer may not use indexes period, or it may use an index intersection, which is not what you want
  5. nardi New Member

    Joe's method doesn't work well in my case because the user can choose any combinations of fields. Then, the if would be much complex and long.

    I'd trying to runaway from dynamic sql due to security problems, but up to the moment is, obsviously, the one that performs better.

    If somebody has the silver bullet...
  6. gaurav_bindlish New Member

  7. joechang New Member

    sorry, didn't realize you said one or more in your original post.
    in which case, some use of dynamic sql is probably desirable,

    i would still try to identify the most common parameter sets and use the if block for those, and dynamic for the rest
    while the dynamic sql might be much better than table scans,
    it will still require a parse and may require frequent recompiles, which is much worse than an explicit statement with a simple efficient plan

    for security concerns with dynamic sql, try using parameterized query with
    sp_executesql
  8. gaurav_bindlish New Member

    How about Chappy's Solution...

    where ((@a is null) or a = @a)
    and ((@b is null) or b = @b)
    and ((@c is null) or c = @c)




    Gaurav
    Moderator
    Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard
  9. Twan New Member

    You could write special blocks for those that can be indexed and leave a more general for those that can't

    This example assumes that there are 3 indexes each with a different leading column

    e.g.


    if( @a is not null ) -- most selective
    select x,y,z
    from table1 -- with good index on column a, b, c for example
    where a = @a
    and b = isnull(@b,b)
    and c = isnull(@c,c)

    else if( @b is not null )
    select x,y,z
    from table1 -- with good index on column b, c for example
    where b = @b
    and c = isnull(@c,c)

    else if( @c is not null ) -- least selective
    select x,y,z
    from table1 -- with good index on column c, for example
    where c = @c
  10. bambola New Member

    Twan - I was thinking about the exact same thing! I would try to do it this way, espacially if query runs often and table is not that small. It should not be that difficult since code would almost repeat itself. There would, of course, be an else where you can decide if you wanna use sp_executesql with it's implications (and advantages) or go with a table scan.

    Gaurav - I tried this solution before. I thought that the condition will be eveluated left to right, therefor if variable IS NULL the second condition in the OR statement will not be evaluated and I would avoid comparing variable to the table column. As far as I remember it doesn't work like this, and no matter what value the vairable contains, you end up with table or index (if column has index) scan.

    EDIT: I was thinking AND and writing OR. Please ignore the second part of this post.

    Bambola.
  11. gaurav_bindlish New Member

    Bambola, Try this...<pre>Create table Random_Data<br />(<br />col1 int Primary Key Identity (10, 10) NOT NULL,<br />col2 int NOT NULL DEFAULT CASE<br />WHEN CONVERT(int, Rand() *1000) % 2 = 1 <br />THEN (Convert(int, rand() * 100000) % 10000 * -1)<br />else Convert(int, Rand() * 100000) % 10000<br />end,<br />col3char(15) null default char((convert(int, rand() * 1000) % 26) + 65)<br />+ char((convert(int, rand() * 1000) % 26) + 65)<br />+ char((convert(int, rand() * 1000) % 26) + 65)<br />+ char((convert(int, rand() * 1000) % 26) + 65)<br />+ Replicate(char((convert(int, rand() * 1000) % 26) + 65) , 11)<br />)<br />declare @counter int<br />set @counter = 1<br />while (@counter &lt;= 10000)<br />begin <br />insert random_data default values<br />set @counter = @counter + 1<br />end<br /><br />insert into Random_data(col2, col3)<br />values (Default, null)<br /><br />create nonclustered index NCL_random_data on random_data(col3) --Create non clustered index to verify <br /> --if the index will be used in query<br /><br />declare @input char(15)<br /><br />select @input = NULL --'KVVDYYYYYYYYYYY'<br /><br />select col1, col3<br />from random_data<br />where ((col3 is null) and (@input is null)) or (col3 = @input)</pre><br /><br />This query uses index and does non clustered index seek on my machine.<br /><br /> |--Index Seek(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[master].[dbo].[Random_Data].[NCL_random_data]), SEEK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Random_Data].[col3]=[@input]) ORDERED FORWARD)<br /><br />Gaurav<br /><i>Moderator<br /><font size="1">Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard</i></font id="size1">
  12. nardi New Member

    Chappy's solution works but doesn't perform well. It makes table or index scan.

    Gaurav's solution performs well, but doesn't solve the problem, because a NULL argument means 'I don't want to filter on this column'.

    Twan solution is a good idea. However it is not so elegant (that's life!).
  13. gaurav_bindlish New Member

    But that's what you wanted right? ou don't want to filter on the columns for which a value has not been selected.

    Am I missing something here? Can you please explain.

    Gaurav
    Moderator
    Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard
  14. bambola New Member

    Here is what I had in mind

    declare @OrderID int
    declare @ShipPostalCode nvarchar(10)
    --select @ShipPostalCode = '50739'
    --select @OrderID = 10508

    select OrderID
    from northwind.db:confused:rders
    where
    (
    ( @OrderID IS NULL )
    OR
    ((@OrderID IS NOT NULL) AND (OrderID = @OrderID) )
    )
    AND
    (
    (@ShipPostalCode IS NULL)
    OR
    ((@ShipPostalCode IS NOT NULL) AND (ShipPostalCode = @ShipPostalCode) )
    )
    This is how it should (?) be working.
    If variable @OrderID IS NULL, it should move to check the next condition - @OrderID IS NOT NULL - which would be false, and exit this AND condition since false AND whatever is false. In this case it should avoid evaluating the last expression - OrderID = @OrderID, which is what I was trying to achieve. I still have to figure out if it is actually evaluating it or not.

    If @OrderID IS NOT NULL, it will still check the second condition since they are in OR, @OrderID IS NOT NULL would evaluate to true, and it should check OrderID = @OrderID and return corrisponding rows.

    So far so good, and it can be a solution for these kind of problem if I could get it to work like I want to :-P

    But... Both OrderID and ShipPostalCode are indexed. But no matter what combination of values I use

    select @ShipPostalCode = '50739'
    select @OrderID = NULL

    select @ShipPostalCode = NULL
    select @OrderID = 10508

    select @ShipPostalCode = '50739'
    select @OrderID = 10508

    it always does index scan. It's better than a table scan but I would have liked to know - is there anyway to see here index seek???


    Gaurav - I have tried the code bellow and i still get index scan. What do you get when you run it?

    select OrderID
    from northwind.db:confused:rders
    where
    ((@OrderID is null) or OrderID = @OrderID)
    and
    ((@ShipPostalCode is null) or ShipPostalCode = @ShipPostalCode)

    Bambola.

    If we are to achieve results never before accomplished, we must expect to employ methods never before attempted. - Francis Bacon
  15. nardi New Member

    Gaurav,
    Using the table in your example, all rows should be returned when @input is null, meaning 'I don't care about col3 values'. However, due to 'col3 is null' expression, when @input is null, just the rows with col3 = NULL are returned.

    In your example, try this to see what I mean:

    select @input = NULL

    Your statement:
    select col1, col3
    from random_data
    where ((col3 is null) and (@input is null)) or (col3 = @input)

    returns just 1 row


    My statement:
    select col1, col3
    from random_data
    where ((@input is null)) or (col3 = @input)

    returns all rows, that is what I want, but does index scan.
  16. gaurav_bindlish New Member

    I ran the query for 10000 records. The northwind database has 830 rows.

    Also the ShipPostalCode does not have any index. So I guess the query optimizer has chosen the table scan as the best option.

    My case had non clustered index for the search column.

    Gaurav
    Moderator
    Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard
  17. gaurav_bindlish New Member

    Now we are getting where the 3 way behaviour of NULL is taking a toll on us. See the think is that you have to consider NULL as a data value or a empty data. If you consider it as a data value, the requirements can't be met using my method. But if you use it as an empty value, then u'll have to convert all NULL's present in the database into some default value which will remove this confusion.


    NULL is one thing in SQL Server which can introduce many bugs and make the life of programmer HELL.

    Gaurav
    Moderator
    Man thrives, oddly enough, only in the presence of a challenging environment- L. Ron Hubbard

Share This Page