SQL Server Performance

Query runs slow with variables, fast with literals?????

Discussion in 'SQL Server 2005 T-SQL Performance Tuning' started by KepBoy, Feb 1, 2009.

  1. KepBoy New Member

    Hi, I know I've read about this issue somewhere but I can't find the article any more, hopefully someone knows and can help.
    Ok, I've got a inline table-valued function that accepts several parameters that are used in the WHERE clause of the SELECT. When I call the function with variables passed in the parameters the query basically does a table scan. If I run the same query with literals, with the same value, in place of the variables in runs far better. Can someone shed some light as to why this is? Using SQL Server 2005 Standard Edition, code is below.
    UDF:-
    ALTER FUNCTION private.ufxArchivedSheets(@grade VARCHAR(max), @sheetNumber VARCHAR(max),
    @orderNumber VARCHAR(max), @reelNumber INT, @palletNumber INT, @printedDate DATETIME) RETURNS TABLE
    RETURN (SELECT she.Grade, she.SheetNumber, she.OrderNumber AS ReelOrderNumber,
    she.ReelNumber, she.Status, plt.OrderNumber AS PalletOrderNumber,
    plt.PalletNumber, pls.Sequence, she.DatePrinted,
    she.SheetIndex, she.PrinterId
    FROM archive.Reel AS rel
    INNER JOIN archive.Sheet AS she
    LEFT OUTER JOIN archive.PalletSheet AS pls
    INNER JOIN archive.Pallet AS plt
    ON pls.PalletId = plt.PalletId
    ON she.SheetNumber = pls.SheetNumber
    ON rel.OrderNumber = she.OrderNumber
    AND rel.ReelNumber = she.ReelNumber
    WHERE (@reelNumber IS NOT NULL
    OR @palletNumber IS NOT NULL
    OR @printedDate IS NOT NULL
    OR @sheetNumber IS NOT NULL)
    AND (@reelNumber IS NULL
    OR (rel.OrderNumber = @orderNumber
    AND rel.ReelNumber = @reelNumber))
    AND (@palletNumber IS NULL
    OR (plt.OrderNumber = @orderNumber
    AND plt.PalletNumber = @palletNumber))
    AND (@printedDate IS NULL
    OR (she.DatePrinted >= @printedDate
    AND she.DatePrinted < DATEADD(d, 1, @printedDate)))
    AND (@sheetNumber IS NULL
    OR (she.SheetNumber = @sheetNumber
    AND she.Grade = @grade)))
    Query that runs fast:-
    SELECT * FROM private.ufxArchivedSheets(NULL, NULL, '1000354', NULL, 94, NULL)
    Query that runs slow:-
    DECLARE @orderNumber VARCHAR(max)
    DECLARE @palletNumber INT
    SELECT @palletNumber = 94, @orderNumber = '1000354'
    SELECT * FROM private.ufxArchivedSheets(NULL, NULL, @orderNumber, NULL, @palletNumber, NULL)
  2. Adriaan New Member

    Are those calls to the UDF the actual statements, or simplified? For instance if you're joining on a UDF, you have a big chance that the outer query does table scans instead of using indexes.
    Not sure that I understand why you would want to do this in a UDF. It's a straightforward query with a number of parameters, so why not simply keep the query as part of the stored procedure?
  3. KepBoy New Member

    Thanks for the responses.
    These are the actual statements that are running fast/slow respectively.
    I created a stored procedure with the same parameters and the SELECT statement in it but it runs slowly for both scenarios (literals and variables).
    As to why I want to do this in a user defined function, I read through the SQL docs and this is one of the suggestions for user defined functions, a parameterized view.
    The full story behind this is that I have 4 stored procedures which all return the same columns but accept different search criteria. The data is also stored in two destinct schemas, the archive and a replicate. The data in the replicate is transaction replication from our live database and the archive contains older data archived from the live db. For this reason the archive and the replicate can contain the same rows, because the data has been copied to the archive but cannot yet be removed from the live database.
    So, I have the udf that returns data from the archive, a udf that returns data from the replicate and the stored procedures that join/merge the two udf's.
    After analysing the stored procedures and udf's I found the slow querys that I originally posted. Hence my question. The database is set up this way because I wanted to reduce code duplication, but I'm open to code duplication to improve performance. However, I still can't understand why the udf runs fast with literals and slow with variables?
    I can post all the code if that would help.
    Kep.
  4. KepBoy New Member

    Ok, I'vechanged the udf and it works a lot faster. Still don't understand why it doesn't work in the first place though :eek:(
    CREATE FUNCTION private.ufxArchivedSheets(@grade VARCHAR(max), @sheetNumber VARCHAR(max),
    @orderNumber VARCHAR(max), @reelNumber INT, @palletNumber INT, @printedDate DATETIME)
    RETURNS @result TABLE (
    Grade CHAR(1) NOT NULL,
    SheetNumber CHAR(13) NOT NULL PRIMARY KEY,
    ReelOrderNumber VARCHAR(20) NOT NULL,
    ReelNumber INT NOT NULL,
    [Status] CHAR(1) NOT NULL,
    PalletOrderNumber VARCHAR(20) NULL,
    PalletNumber INT NULL,
    Sequence INT NULL,
    DatePrinted DATETIME NOT NULL,
    SheetIndex INT NOT NULL,
    PrinterId CHAR(1) NOT NULL
    )
    BEGIN
    IF @palletNumber IS NULL
    INSERT INTO @result
    (Grade, SheetNumber, ReelOrderNumber, ReelNumber, [Status],
    PalletOrderNumber, PalletNumber, Sequence, DatePrinted, SheetIndex,
    PrinterId)
    SELECT she.Grade, she.SheetNumber, she.OrderNumber AS ReelOrderNumber,
    she.ReelNumber, she.Status, plt.OrderNumber AS PalletOrderNumber,
    plt.PalletNumber, pls.Sequence, she.DatePrinted,
    she.SheetIndex, she.PrinterId
    FROM archive.Reel AS rel
    INNER JOIN archive.Sheet AS she
    LEFT OUTER JOIN archive.PalletSheet AS pls
    INNER JOIN archive.Pallet AS plt
    ON pls.PalletId = plt.PalletId
    ON she.SheetNumber = pls.SheetNumber
    ON rel.OrderNumber = she.OrderNumber
    AND rel.ReelNumber = she.ReelNumber
    WHERE (@reelNumber IS NOT NULL
    OR @printedDate IS NOT NULL
    OR @sheetNumber IS NOT NULL)
    AND (@reelNumber IS NULL
    OR (rel.OrderNumber = @orderNumber
    AND rel.ReelNumber = @reelNumber))
    AND (@palletNumber IS NULL
    OR (plt.OrderNumber = @orderNumber
    AND plt.PalletNumber = @palletNumber))
    AND (@printedDate IS NULL
    OR (she.DatePrinted >= @printedDate
    AND she.DatePrinted < DATEADD(d, 1, @printedDate)))
    AND (@sheetNumber IS NULL
    OR (she.SheetNumber = @sheetNumber
    AND she.Grade = @grade))
    ELSE
    INSERT INTO @result
    (Grade, SheetNumber, ReelOrderNumber, ReelNumber, [Status],
    PalletOrderNumber, PalletNumber, Sequence, DatePrinted, SheetIndex,
    PrinterId)
    SELECT she.Grade, she.SheetNumber, she.OrderNumber AS ReelOrderNumber,
    she.ReelNumber, she.Status, plt.OrderNumber AS PalletOrderNumber,
    plt.PalletNumber, pls.Sequence, she.DatePrinted,
    she.SheetIndex, she.PrinterId
    FROM archive.Reel AS rel
    INNER JOIN archive.Sheet AS she
    LEFT OUTER JOIN archive.PalletSheet AS pls
    INNER JOIN archive.Pallet AS plt
    ON pls.PalletId = plt.PalletId
    ON she.SheetNumber = pls.SheetNumber
    ON rel.OrderNumber = she.OrderNumber
    AND rel.ReelNumber = she.ReelNumber
    WHERE (@reelNumber IS NOT NULL
    OR @palletNumber IS NOT NULL
    OR @printedDate IS NOT NULL
    OR @sheetNumber IS NOT NULL)
    AND (@reelNumber IS NULL
    OR (rel.OrderNumber = @orderNumber
    AND rel.ReelNumber = @reelNumber))
    AND (plt.OrderNumber = @orderNumber
    AND plt.PalletNumber = @palletNumber)
    AND (@printedDate IS NULL
    OR (she.DatePrinted >= @printedDate
    AND she.DatePrinted < DATEADD(d, 1, @printedDate)))
    AND (@sheetNumber IS NULL
    OR (she.SheetNumber = @sheetNumber
    AND she.Grade = @grade))
    RETURN
    END
    GO
  5. Adriaan New Member

    Indeed this is one of the applications of a UDF, but that is not to say that a UDF is your best option for the functionality.
    If you have basically the same query but with lots of variation in the criteria between calls, then dynamic SQL, using sp_ExecuteSQL with proper parameter declaration but only for the criteria actually applied, can often help improve performance because it will create execution plans that can be reused.
  6. satya Moderator

    In addtion to what Adrian referred I see that this may have to do with the fact that the initial compilation unit is the whole batch (as opposed to the statement).
    The main part here is the optimizer to compile a plan based on the parameters and WHERE clause that you have used. To explain this further SQL Server expands the definitions of of the UDFs, substituting the parameter values with the specified constants, and the optimizer will
    optimize the expanded query with knowledge of the values in the filter.
    Again I would like to refer what Adrian says that make this witha stored procedure that in order to get this batch gets optimized upon first invocation AFTER the value was passed through the parameter. The better understanding for the SQL OPTIMIZER about what the value of the variable is when it optimizes the batch. This solution would generate an optimal plan for the query based on the value provided in the first invocation, and would reuse the plan in subsequent invocations of the stored procedure.
    Hope this helps.
  7. FrankKalis Moderator

    Probably this might be worth reading: http://groups.google.ch/group/microsoft.public.sqlserver.programming/msg/1e4a2438bed08aca

Share This Page