Building a Generic Search SP | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Building a Generic Search SP

In my experience with SQL Server I have had to build various search forms in my Windows apps to allow general searching of a SQL database. I've never found a way that was both fast and produced "clean" code. I am looking for recommendations on how other developers have handled this type of problem. Sorry if this posting is a little verbose, but hopefully I can explain my issue clearly.

For my current app I have a search interface to query customer orders. The search form has about 22 different fields that can be searched on. A lot of the fields are LIKE searches on varchar columns. Several of the fields are not very selective (i.e. status or priority, where 15-50 percent of the rows meet the criteria). The idea is to have a search form with a lot of available fields to search by, where you don't know which ones the user will fill in, and the search runs quickly for almost any combination entered.

My current solution is to generate custom T-SQL within the search stored procedure. All of the search parameters are passed in to the SP. I take the core 3-4 tables I need to join and build a T-SQL command based on which fields were actually specified by the user. Basically, I have a bunch of IF statements testing each input parameter and building the SELECT and WHERE parts of the search query in nvarchar variables. I then use sp_executesql to run the final custom query.

The custom query builds a temp table full of the primary keys of all the orders I want to return. I then use another SQL command to return the result fields. Thus, I separate finding the orders from selecting the result fields based on those orders. This is partially because there are 7 different result "views" possible from the same search criteria, depending on what the user wants to see as the output. I also do the separation because there are 2-3 large tables that I don't try to join in my custom query due to performance issues. I run the custom query first to get a smaller potential result set, and then I join that against the large tables one at a time. However, this means that I have two temp tables that I keep dumping results back and forth between (imagine pouring water between two glasses repeatedly, each time pouring less water). I have to keep moving results between the two tables because I do not know at the start how many of the 3 large tables I have to join against (if any) in addition to the main custom query.

My stored procedure runs fairly quickly. However, it is getting larger than I would like (400-500 lines now), and it is getting hard to understand and to maintain, particularly with so much custom SQL in nvarchar variables. Overall, it seems like it is getting kludgy. I would appreciate anyone who has suggestions or experience to share on how they've tackled this type of design problem in the past. Thanks,
Dan
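As a rough illustration of the pattern Dan describes (conditionally appending predicates and collecting primary keys in a temp table), here is a minimal sketch. The table and column names (OrderHeader, Status, CustomerName) are hypothetical, not from his actual system, and the filter values are passed to sp_executesql as real parameters rather than concatenated into the string.

CREATE PROCEDURE SearchOrders
    @Status       int          = NULL,
    @CustomerName nvarchar(50) = NULL
AS
BEGIN
    -- Temp table created in the outer scope so it is visible inside the dynamic SQL and afterwards
    CREATE TABLE #keys (OrderID int NOT NULL)

    DECLARE @sql nvarchar(2000)
    SET @sql = N'INSERT INTO #keys (OrderID)
                 SELECT o.OrderID FROM OrderHeader o WHERE 1 = 1'

    -- Append a predicate only for the criteria the user actually filled in
    IF @Status IS NOT NULL
        SET @sql = @sql + N' AND o.Status = @Status'
    IF @CustomerName IS NOT NULL
        SET @sql = @sql + N' AND o.CustomerName LIKE @CustomerName + N''%'''

    EXEC sp_executesql @sql,
         N'@Status int, @CustomerName nvarchar(50)',
         @Status, @CustomerName

    -- #keys now holds the matching primary keys; join it to whichever "view" of
    -- result columns the caller asked for
    SELECT o.*
    FROM   OrderHeader o
    JOIN   #keys k ON k.OrderID = o.OrderID
END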

Really difficult to diagnose without seeing actual technical details, but I've had these sorts of problems in the past. I've faced this sort of problem on numerous occasions and am facing the most complex one to date right this moment. :)

In my first attempt I simply placed good covering indexes on the most frequently used criteria. This is fine for some systems: although users could flexibly query anything, when I ran a Profiler trace over the course of a month I realised only about 5% of the search options were actually being used. In that situation I was able to come up with a definitive set of indexes which means the system runs very fast 99% of the time. I review the indexes from time to time, but no one has ever complained that it is slow.

Moving on to the second case, a situation similar to yours I suspect, where a vast amount of data was being queried on many fields and each field was used about as much as the others. For this I did something a little similar to yours, in that it used temporary tables and a dynamic string passed in. The string did not contain SQL, but contained metadata describing what criteria the user wanted. My stored procedure then parsed this string and decided on a suitable attack strategy, like this: although any field could be queried, some fields are always less selective than others. The SP sorted the criteria into this predetermined order of selectivity. The initial temporary table was built by selecting data from the source table, driven by whichever field was most selective, was actually present in the user's query, AND had a good index on it. So the temp table contains a result set which contains ALL valid records, but also many records which should not be in there. I then go through each of the other criteria, but rather than selecting the valid records into a new table, I delete the non-valid records from the temporary table I just created. This wouldn't necessarily perform well in all situations, but it performed very well for mine.

The third case (the one I'm working on now) is very complex, involving subqueries, lots of dynamically decided joins, and some queries nested sometimes up to 6 deep. Since this is a single-user system it probably won't be much help to you, but this system will monitor how a given user uses the system and dynamically adjust its indexes over a period of time, so that performance increases according to how that user works (typically each user will use it in his own consistent but unique way). This utilises execution plans (and possibly will eventually utilise SQL's column statistics, if I can access them). This, however, is a vast amount of work and wouldn't work very well for multi-user systems.

So I've no real solution for you, but hope I've given you some ideas.
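To make the select-then-delete strategy above concrete, a small sketch follows. The Orders table, the columns, and the assumption that @CustomerID is the most selective indexed criterion are all hypothetical, and the criteria columns are assumed to be non-nullable.

-- Seed the candidate set from the most selective criterion the user supplied
SELECT OrderID, CustomerID, Status, Priority
INTO   #candidates
FROM   Orders
WHERE  CustomerID = @CustomerID

-- One pass per remaining criterion: delete the rows that fail it
IF @Status IS NOT NULL
    DELETE FROM #candidates WHERE Status <> @Status

IF @Priority IS NOT NULL
    DELETE FROM #candidates WHERE Priority <> @Priority

-- Whatever survives is the result set
SELECT * FROM #candidates

Deleting from one shrinking table avoids the pour-between-two-glasses copying Dan describes, at the cost of a scan of the temp table per criterion.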
I have done something similar myself. It was a search form with a lot of search parameters that joined up to about 10 tables. I spent a lot of time trying to build that query (executed with sp_executesql) the best way possible: first joining the tables that reduce the result set as much as possible, and filtering first on the fields with the best indexes. I did have to add SET QUERY_GOVERNOR_COST_LIMIT value_in_seconds for those queries that SQL Server estimated would run for too long, though queries almost always run.
I was not very clear on what exactly you are doing with the 2 temporary tables. But if I understood you right, it is something that might be handled with correlated queries, or maybe even inline functions that accept parameters and return a result set.
Any chance you could post some code? Maybe the parts you think are the most problematic. Bambola.
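For reference, here is a sketch of the two things Bambola mentions, with made-up object names: an inline table-valued function that accepts parameters and returns a result set, and the query governor setting (the 30-second threshold is just an example value).

-- Inline table-valued function: the optimiser expands it much like a parameterised view
CREATE FUNCTION dbo.fn_OrderKeysByStatus (@Status int, @FromDate datetime)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID
    FROM   dbo.Orders
    WHERE  Status    = @Status
      AND  OrderDate >= @FromDate
)
GO

-- Refuse queries the optimiser estimates will cost more than 30 seconds
SET QUERY_GOVERNOR_COST_LIMIT 30

-- Join the function's result set against the tables needed for the output columns
SELECT o.*
FROM   dbo.Orders o
JOIN   dbo.fn_OrderKeysByStatus(2, '2003-01-01') k ON k.OrderID = o.OrderID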
I don't know if this fits your problem or not, but I am referring to a solution I had seen in a similar case. I had a user screen to fill in various details. The user could fill in some details, select values from drop-down lists, combo boxes, check boxes, etc. All the data from the screen was passed to a procedure and processed something like the attached SP. I am taking the Orders and Order Details tables from the Northwind database as an example. The functionality of the SP is to take whatever the user supplies in any parameter and return all related rows:

DROP PROCEDURE SEARCH_ORDER
GO
CREATE PROCEDURE SEARCH_ORDER
@OrderID int = NULL,
@CustomerID nchar(10) = NULL,
@EmployeeID int = NULL,
@OrderDate datetime = NULL,
@RequiredDate datetime = NULL,
@ShippedDate datetime = NULL,
@ShipVia int = NULL,
@Freight money = NULL,
@ShipName nvarchar(80) = NULL,
@ShipAddress nvarchar(120) = NULL,
@ShipCity nvarchar(30) = NULL,
@ShipRegion nvarchar(30) = NULL,
@ShipPostalCode nvarchar(20) = NULL,
@ShipCountry nvarchar(30) = NULL,
@ProductID int = NULL,
@UnitPrice money = NULL,
@Quantity smallint = NULL,
@Discount real = NULL
AS SELECT
Orders.CustomerID,
Orders.EmployeeID,
Orders.Freight,
Orders.OrderDate,
Orders.OrderID,
Orders.RequiredDate,
Orders.ShipAddress,
Orders.ShipCity,
Orders.ShipCountry,
Orders.ShipName,
Orders.ShippedDate,
Orders.ShipPostalCode,
Orders.ShipRegion,
Orders.ShipVia,
[Order Details].Discount,
[Order Details].ProductID,
[Order Details].Quantity,
[Order Details].UnitPrice
FROM Orders
JOIN [Order Details]
ON Orders.OrderID = [Order Details].OrderID
WHERE Orders.CustomerID = ISNULL(@CustomerID, Orders.CustomerID)
AND Orders.EmployeeID = ISNULL(@EmployeeID, Orders.EmployeeID)
AND Orders.Freight = ISNULL(@Freight, Orders.Freight)
AND Orders.OrderDate = ISNULL(@OrderDate, Orders.OrderDate)
AND Orders.OrderID = ISNULL(@OrderID, Orders.OrderID)
AND Orders.RequiredDate = ISNULL(@RequiredDate, Orders.RequiredDate)
AND Orders.ShipAddress = ISNULL(@ShipAddress, Orders.ShipAddress)
AND Orders.ShipCity = ISNULL(@ShipCity, Orders.ShipCity)
AND Orders.ShipCountry = ISNULL(@ShipCountry, Orders.ShipCountry)
AND Orders.ShipName = ISNULL(@ShipName, Orders.ShipName)
AND Orders.ShippedDate = ISNULL(@ShippedDate, Orders.ShippedDate)
AND Orders.ShipPostalCode = ISNULL(@ShipPostalCode, Orders.ShipPostalCode)
AND ISNULL(Orders.ShipRegion, '') = ISNULL(ISNULL(@ShipRegion, Orders.ShipRegion), '')
AND Orders.ShipVia = ISNULL(@ShipVia, Orders.ShipVia)
AND [Order Details].Discount = ISNULL(@Discount, [Order Details].Discount)
AND [Order Details].ProductID = ISNULL(@ProductID, [Order Details].ProductID)
AND [Order Details].Quantity = ISNULL(@Quantity, [Order Details].Quantity)
AND [Order Details].UnitPrice = ISNULL(@UnitPrice, [Order Details].UnitPrice)

E.g. if you want to find any Order with
– OrderID 10248: execute SEARCH_ORDER @OrderID = 10248
– OrderDate 1996-07-04: execute SEARCH_ORDER @OrderDate = '1996-07-04'

Note that special handling is needed for columns which are nullable, e.g. Orders.ShipRegion (hence the ISNULL wrapping on both sides above); I haven't done the same for all of the nullable columns. This procedure may not perform all that well, but the functionality was met and both the users and developers were happy. Hope this helps. Gaurav
Chappy, I have worked on something like your 3rd case before and I designed the following method:
– Run a Profiler trace against the database during production hours. Make sure that the trace only captures long-running queries. You can stop/start the trace automatically using sp_trace% (a sketch follows this list).
– During off-hours, use the Index Tuning Wizard (ITW) to analyse the trace and let it recommend new indexes. You can run the ITW from the command line via the itwiz.exe utility so that you can schedule it (there is no documented COM interface for the ITW).
– Itwiz.exe lets you output the recommendations to a script file, which you can then run with osql to create the new indexes (and drop the old ones).

I see a small problem in your case though: if it is a single-user system, the computer might be turned off outside production hours. Jacco
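To make the stop/start idea in the first bullet concrete, here is a rough sketch using the sp_trace% procedures; the file path and the 5-second duration threshold are placeholders. A scheduled job would later call sp_trace_setstatus with 0 to stop the trace and 2 to close it before feeding the trace file to the tuning step.

-- Create a server-side trace file (the .trc extension is added automatically)
DECLARE @traceid int, @maxsize bigint, @minduration bigint, @on bit
SET @maxsize = 50
SET @on = 1
EXEC sp_trace_create @traceid OUTPUT, 0, N'C:\Traces\slow_queries', @maxsize

-- Capture TextData (column 1) and Duration (column 13) for SQL:BatchCompleted (event 12)
EXEC sp_trace_setevent @traceid, 12, 1, @on
EXEC sp_trace_setevent @traceid, 12, 13, @on

-- Keep only batches whose Duration is >= 5000 ms (logical operator 0 = AND, comparison 4 = >=)
SET @minduration = 5000
EXEC sp_trace_setfilter @traceid, 13, 0, 4, @minduration

-- Start the trace; stop it later with sp_trace_setstatus @traceid, 0 and close it with 2
EXEC sp_trace_setstatus @traceid, 1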
Hi. Thanks for the tips. Unfortunately I have several problems which make my task a lot more difficult… It is a single-user system, and the server is deployed locally on the user's machine. I neither have nor want any admin privileges over the server. Also, the SQL Server is actually MSDE, which means tools such as the Index Tuning Wizard and Profiler are not available. As you can see, this means a lot of coding just to get the system to optimise itself. Lots of fun though, and it should hopefully be a powerful library when it's done.
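Purely as a sketch of the self-tuning idea (every name here, including the SearchUsage table, the threshold of 100, the ix_auto_ prefix, and the Orders target table, is invented for illustration): record which criteria get used, and let a maintenance step build an index through dynamic SQL once a column proves popular.

-- Hypothetical usage log; the search procedure bumps a counter for each criterion it sees
CREATE TABLE SearchUsage (ColumnName sysname NOT NULL PRIMARY KEY, TimesUsed int NOT NULL DEFAULT 0)
GO

-- Periodic maintenance step: index the most-used column once it crosses a threshold
DECLARE @col sysname, @sql nvarchar(500)
SELECT TOP 1 @col = ColumnName
FROM   SearchUsage
WHERE  TimesUsed > 100
ORDER BY TimesUsed DESC

IF @col IS NOT NULL
   AND NOT EXISTS (SELECT * FROM sysindexes WHERE name = N'ix_auto_' + @col)
BEGIN
    SET @sql = N'CREATE INDEX ix_auto_' + @col + N' ON dbo.Orders (' + @col + N')'
    EXEC sp_executesql @sql
END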