SQL Server Performance

I really need an efficient way to do this...

Discussion in 'General Developer Questions' started by airjrdn, Jul 27, 2005.

  1. airjrdn New Member

    I'd like some input on potentially more efficient ways of doing this. It seems overly complicated and I'm hoping there's a better way.

    In this code, the following represents:
    ---------------------------------------
    Product = Table with products for sale
    ProductPrice = Table with Prices for the products
    * Note that the same product purchased at different times may have different prices
    ** Note that the product may have multiple price entries for the same EffDate - newest UpdateDate should be chosen
    *** Since products could have been priced when the first entry for an EffDate was in effect, those records must be retained



    [other code here]
    from Product p
    Inner Join ProductPrice pp On p.a = pp.a
    and p.b = pp.b
    and p.c = pp.c
    and p.PurchaseDate >= pp.EffDate-- Price must have been in effect at this purchase date
    and pp.EffDate = (SelectMax(EffDate)-- Newest price that was in effect for this product
    FromProductPrice pp2
    Wherep.a = pp2.a
    andp.b = pp2.b
    andp.c = pp2.c
    andpp2.EffDate <= p.PurchaseDate)
    and pp.UpdateDate = (SelectMax(UpdateDate)-- Newest price with that effective date
    FromProductPrice pp2
    Wherep.a = pp2.a
    andp.b = pp2.b
    andp.c = pp2.c
    andpp2.EffDate = pp.EffDate)

  2. Madhivanan Moderator

    Posting table structures, sample data and expected result would be helpful to give the solution


    Madhivanan

    Failing to plan is Planning to fail
  3. FrankKalis Moderator


    USE Northwind
    SELECT t1.OrderID, t1.Quantity FROM [order details] t1
    WHERE t1.Quantity=
    (SELECT MAX(Quantity) FROM [order details] t2
    WHERE t1.orderid=t2.orderid)
    ORDER BY t1.orderid

    SELECT t1.* FROM [order details] t1 INNER JOIN
    (SELECT orderid, MAX(Quantity) AS maxdate FROM [order details] GROUP BY orderid) t2
    ON t1.orderid = t2.orderid
    AND t1.Quantity = t2.maxdate
    ORDER BY t1.orderid

    are two other alternatives. Are you experienceing any performance issues with your query?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    Ich unterstütze PASS Deutschland e.V. http://www.sqlpass.de)
  4. airjrdn New Member

    I'm pretty reluctant to post structures and actual code without getting prior approval. I tried as best I could to post very similar code though.

    As for performance issues, yes. The process this runs within takes around 12 hours to run. Granted, we have a large # of records, but this is an integral part of the overall process.

    We process just under a million records a day. For each of these records, we currently run through this process. For business reasons, we may end up having to run through this same process multiple times per record.

    I'm contemplating a process that would pre-build a table, having already determined the most recently updated record per date, then simply doing a 1 to 1 join to that table. While I think this would be the most efficient, I was wondering if someone might have stumbled across something in the past that would be a more efficient way of accomplishing this.
  5. Adriaan New Member

    One of the problems may be that your query's resultset contains duplicate lines because of the following line in your JOIN definition: p.PurchaseDate >= pp.EffDate. If you're suppressing the duplicate lines with a DISTINCT clause, this will slow down your query. From what I can tell, you should be able to evaluate this bit in an EXISTS clause, which doesn't produce duplicate rows.

    I would also use derived tables to give you the MAX(EffDate) and MAX(UpdateDate) - use GROUP BY on the columns on which you will link the main table to each derived table.
  6. airjrdn New Member

    They aren't being suppressed with a distinct, the "and pp.UpdateDate = (SelectMax(UpdateDate)" line handles that.<br /><br />The realism is that we will get records today stating that the same product was purchased on multiple dates. The price may be different depending on the date. We need the latest price that was in effect when the product was purchased.<br /><br />Clear as mud isn't it? <img src='/community/emoticons/emotion-1.gif' alt=':)' /><br />
  7. Adriaan New Member

    Okay, let's do this one step at a time.

    You need a query that tells you, for each actual purchase date of each product, what the most recent price was at that time.

    Assuming that the product is defined by columns a, b and c, this would be my guess:

    SELECT Product.a, Product.b, Product.c, Product.PurchaseDate,
    --correlated subquery - start
    (SELECT TOP 1 pp.Price FROM ProductPrice pp
    WHERE pp.a = Product.a AND pp.b = Product.b AND pp.c = Product.c
    AND pp.EffDate <= Product.PurchaseDate ORDER BY pp.EffDate DESC, pp.UpdateDate DESC) AS Effective_Price
    --correlated subquery - end
    FROM Product

    Let's look at that subquery to see if I got it right:

    SELECT TOP 1 pp.Price FROM ProductPrice pp
    WHERE pp.a = Product.a AND pp.b = Product.b AND pp.c = Product.c
    AND pp.EffDate <= Product.PurchaseDate
    ORDER BY pp.EffDate DESC, pp.UpdateDate DESC

    The "TOP 1" clauses should make sure I have only one return value.
    The "WHERE" clause makes sure it's
    (1) for the same product (a, b and c columns)
    (2) for EffDate on or before the PurchaseDate
    The "ORDER BY" clause makes sure it's
    (1) for the most recent EffDate on or before the PurchaseDate
    (2) for the most recently updated record for the most recent EffDate
  8. airjrdn New Member

    I think you've got it right. Let me put some code together to test it for performance.

    Edit...

    One thing not accounted for...the updatedate.

    There could be two entries w/the same EffectiveDate. We use whichever one was updated (inserted really) most recently.

    So it's the newest effective date, and if there's more than one, it's the one with the newest updatedate.
  9. Adriaan New Member

    No, the UpdateDate IS being accounted for, as we ORDER BY both EffectiveDate DESC and UpdateDate DESC, in that order.

    So if you happen to have two entries with the same EffectiveDate at the top of the list, the one with the most recent UpdateDate comes out on top.

    Is it possible to have two entries with the same EffectiveDate and UpdateDate? Then you need to add a third column to the ORDER BY clause to break the tie.
  10. airjrdn New Member

    No, there shouldn't be a conflict w/the update date. I'll give this a whirl.
  11. airjrdn New Member

    Sorry for the delay, I just now had a chance to give this a shot.

    The table in the original code that's equivalent to ProductPrice here is referenced 45 (yeah, 45) times in the original procedure in the select. Obviously, it wouldn't be more efficient to do the subselect that many times. I tried joining to it as a derived table, but in a derived table, you can't join back out to the outer tables. Meaning this won't work:

    SELECT Product.a, Product.b, Product.c, Product.PurchaseDate, PurchPrice.Price
    FROM Product
    INNER JOIN (SELECT TOP 1 pp.Price FROM ProductPrice pp
    WHERE pp.a = Product.a AND pp.b = Product.b AND pp.c = Product.c
    AND pp.EffDate <= Product.PurchaseDate ORDER BY pp.EffDate DESC, pp.UpdateDate DESC) PurchPrice

    Is there any other way to make it work that you can think of?

  12. FrankKalis Moderator

    I might be missing something, but have you noticed the second SELECT I've posted in my first reply?


    SELECT t1.* FROM [order details] t1
    INNER JOIN
    (SELECT orderid, MAX(Quantity) AS maxdate
    FROM [order details]
    GROUP BY orderid) t2 ON t1.orderid = t2.orderid
    AND t1.Quantity = t2.maxdate ORDER BY t1.orderid

    Isn't that applicable to what you need?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    Ich unterstütze PASS Deutschland e.V. http://www.sqlpass.de)
  13. airjrdn New Member

    Trying now...
  14. Adriaan New Member

    I believe that in SQL 2005 you will be able to create a "derived table" at the start of your stored procedure, and then access that as if it were a regular table.

    Of course the same can already be accomplished with a temp table that holds "most recent Price per EffDate per product" like so:

    CREATE TABLE #TMP (a VARCHAR(5), b. VARCHAR(5), c VARCHAR(5), EffDate DATETIME, Price FLOAT)
    -- You may want to add some indexing here ...

    INSERT INTO #TMP
    SELECT pp.a, pp.b, pp.c, pp.EffDate, pp.Price
    FROM ProductPrice pp
    WHERE NOT EXISTS
    (SELECT * FROM ProductPrice pp2
    WHERE pp2.a = pp.a AND pp2.b = pp.b AND pp2.c = pp.c
    AND pp2.EffDate = pp.EffDate AND pp2.UpdateDate > pp.UpdateDate)

    -- (1) Cases where PurchaseDate matches an EffDate
    SELECT p.a, p.b, p.c, p.PurchaseDate, t.Price
    FROM Product p
    INNER JOIN #TMP ON p.a = #TMP.a AND p.b = #TMP.b AND p.c = #TMP.c
    AND p.PurchaseDate = #TMP.EffDate
    -- (2) cases without matching EffDate: we match on latest EffDate before PurchaseDate
    UNION ALL
    SELECT p.a, p.b, p.c, p.PurchaseDate, t1.Price
    FROM Product p
    INNER JOIN #TMP t1 ON p.a = t1.a AND p.b = t1.b AND p.c = t1.c
    WHERE NOT EXISTS
    (SELECT t2.a FROM #TMP t2
    WHERE p.a = t2.a AND p.b = t2.b AND p.c = t2.c AND p.PurchaseDate = t2.PurchaseDate)
    AND t1.EffDate =
    (SELECT MAX(t3.EffDate) FROM #TMP t3
    WHERE p.a = t3.a AND p.b = t3.b AND p.c = t3.c AND t3.EffDate < p.PurchaseDate)
  15. airjrdn New Member

    That's actually what I spent some time doing late last week.

    Before running the select, I pre-built the table with the latest updatedate for each effective date. On a join condition with relatively few rows, it took 23% of the time while the original code took 77% of the time. However, when the join conditions changed, it took 47% while the original took 53%.

    I've got an idea I'd like to try for a clustered index, but given the current processes that are running, I can't really try it right now.

    FrankKalis - I'm running the version similar to what you posted now.
  16. Adriaan New Member

    Not sure what you mean by "when the join conditions changed" - the join statement is static?

    If you're comparing the response time of queries in a batch, you should clear out buffers before, inbetween and after to get an accurate timing. This can be done with some DBCC commands that hopefully someone will add to this thread ... and you probably should not do that on a production server.
  17. airjrdn New Member

    I got permission to post the actual code. Given the formatting, I wasn't sure how to make it look "nice" in here, so I simply pasted it within a code block. Note that even this is a stripped down version, the actual code is updating these fields in a table. I've converted it to a select for testing purposes.

    Also, there are of course other tables, etc. being used here, but this should give you a better idea of what I'm working with.

    Does this help?



    selectCDRArchive.CDRRecordID,
    AdminRateCodeNew =Case
    When AdminRateCode.MsgRateAmt Is Not Null and AdminRateCode.MinRateAmt Is Not Null
    Then Case AdminRateCode.MsgRateTypeCodeID
    When 1 Then (AdminRateCode.MsgRateAmt/CDRArchive.BMOU)
    When 2 Then ((AdminRateCode.MsgRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))/CDRArchive.BMOU)
    Else 0
    End + Case AdminRateCode.MinRateTypeCodeID
    When 1 Then AdminRateCode.MinRateAmt
    When 2 Then (AdminRateCode.MinRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))
    Else 0
    End
    When AdminRateCode.MsgRateAmt Is Null and AdminRateCode.MinRateAmt Is Not Null
    ThenCase AdminRateCode.MinRateTypeCodeID
    When 1 Then AdminRateCode.MinRateAmt
    When 2 Then (AdminRateCode.MinRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))
    Else 0
    End
    When AdminRateCode.MsgRateAmt Is Not Null and AdminRateCode.MinRateAmt Is Null
    ThenCase AdminRateCode.MsgRateTypeCodeID
    When 1 Then (AdminRateCode.MsgRateAmt/CDRArchive.BMOU)
    When 2 Then ((AdminRateCode.MsgRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))/CDRArchive.BMOU)
    Else 0
    End
    Else Null
    End,
    ContractRateAmt4 =CaseWhen AdminRateCode.MsgConsortiumRateAmt Is Not Null and AdminRateCode.MinConsortiumRateAmt Is Not Null
    ThenCase AdminRateCode.MsgConsortiumRateTypeCodeID
    When 1 Then (AdminRateCode.MsgConsortiumRateAmt/CDRArchive.BMOU)
    When 2 Then ((AdminRateCode.MsgConsortiumRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))/CDRArchive.BMOU)
    Else 0
    End +Case AdminRateCode.MinConsortiumRateTypeCodeID
    When 1 Then AdminRateCode.MinConsortiumRateAmt
    When 2 Then (AdminRateCode.MinConsortiumRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))
    Else 0
    End
    When AdminRateCode.MsgConsortiumRateAmt Is Null and AdminRateCode.MinConsortiumRateAmt Is Not Null
    ThenCase AdminRateCode.MinConsortiumRateTypeCodeID
    When 1 Then AdminRateCode.MinConsortiumRateAmt
    When 2 Then (AdminRateCode.MinConsortiumRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))
    Else 0
    End
    When AdminRateCode.MsgConsortiumRateAmt Is Not Null and AdminRateCode.MinConsortiumRateAmt Is Null
    ThenCase AdminRateCode.MsgConsortiumRateTypeCodeID
    When 1 Then (AdminRateCode.MsgConsortiumRateAmt/CDRArchive.BMOU)
    When 2 Then ((AdminRateCode.MsgConsortiumRateAmt * (CDRArchive.CDRNoSurCostAmt/CDRArchive.BMOU))/CDRArchive.BMOU)
    Else 0
    End
    Else Null
    End,
    ContractRateCodeID4 = Case When AdminRateCode.MsgConsortiumRateAmt Is Not Null Or AdminRateCode.MinConsortiumRateAmt Is Not Null
    Then AdminRateCode.AdminRateCodeID
    Else Null
    End,
    CDRStatsUpdateFlag = Left(Cast(AdminRateCode.RateApplyCodeID As Varchar(10)), 1),
    ContractRateCodeID5 = Case When AdminRateCode.FixedInvoiceRateAmt Is Not Null Then AdminRateCode.AdminRateCodeID Else Null End,
    ContractRateAmt5 = AdminRateCode.FixedInvoiceRateAmt,
    ContractCostAmt5 = AdminRateCode.FixedInvoiceRateAmt * CDRArchive.BMOU,
    AdminRateCodeNewID = AdminRateCode.AdminRateCodeID
    intoDefaultDB..QwestRateTest_Orig_Version
    FromQwestTieredMonthly.dbo.tbCDRMonthly CDRArchive-- try a clustered index on CarrierID, ProductID, JurisdictionID, ProductTypeCodeID, DACodeID
    --FromQwestMonthly.dbo.tbCDRMonthly_200506 CDRArchive
    Inner Join CDRRating.dbo.tbCarrierProductCode CarrierProductCode OnCDRArchive.CarrierID = CarrierProductCode.CarrierID
    AndCDRArchive.ProductID = CarrierProductCode.ProductID
    AndCDRArchive.JurisdictionID = CarrierProductCode.JurisdictionID
    AndCDRArchive.ProductTypeCodeID = CarrierProductCode.ProductTypeCodeID
    AndCDRArchive.DACodeID = CarrierProductCode.DACodeID
    InnerJoin ANPICorporate.dbo.tbState StateCode OnCDRArchive.StateCalling = StateCode.StateCode
    InnerJoin CDRRating.dbo.tbAdminMethod AdminMethod OnCarrierProductCode.CarrierProductCodeID = AdminMethod.CarrierProductCodeID
    InnerJoin CDRRating.dbo.tbAdminRateCode AdminRateCode OnAdminMethod.AdminMethodID = AdminRateCode.AdminMethodID
    AndCDRArchive.MemberID = AdminRateCode.MemberID
    AndStateCode.StateCodeID = AdminRateCode.OrigStateCodeID
    AndCDRArchive.OrigOCN = AdminRateCode.OrigOCNCode
    AndCDRArchive.CallDate >= AdminRateCode.EffectiveDate
    AndAdminRateCode.EffectiveDate = (SelectMax(EffectiveDate)
    FromCDRRating.dbo.tbAdminRateCode z
    Wherez.AdminMethodID = AdminRateCode.AdminMethodID
    Andz.MemberID = AdminRateCode.MemberID
    Andz.OrigStateCodeID = AdminRateCode.OrigStateCodeID
    Andz.OrigOCNCode = AdminRateCode.OrigOCNCode
    Andz.EffectiveDate <= CDRArchive.CallDate)
    AndAdminRateCode.UpdateDate = (SelectMax(UpdateDate)
    FromCDRRating.dbo.tbAdminRateCode x
    Wherex.AdminMethodID = AdminRateCode.AdminMethodID
    Andx.MemberID = AdminRateCode.MemberID
    Andx.OrigStateCodeID = AdminRateCode.OrigStateCodeID
    Andx.OrigOCNCode = AdminRateCode.OrigOCNCode
    Andx.EffectiveDate = AdminRateCode.EffectiveDate)
    WhereCarrierProductCode.CarrierID = 12
    --WhereCarrierProductCode.CarrierID = 8
    AndAdminRateCode.AdminMethodID In (SelectDistinct AdminMethodID
    FromCDRRating.dbo.tbAdminRateCode AdminRateCode2
    WhereAdminRateCode2.OrigNPA Is Null
    AndAdminRateCode2.OrigNXX Is Null
    AndAdminRateCode2.OrigStateCodeID Is Not Null
    AndAdminRateCode2.OrigLATACodeID Is Null
    AndAdminRateCode2.OrigOCNCode Is Not Null
    --AndAdminRateCode2.OrigOCNCode Is Null
    AndAdminRateCode2.OrigTier Is Null
    AndAdminRateCode2.TermNPA Is Null
    AndAdminRateCode2.TermNXX Is Null
    AndAdminRateCode2.TermStateCodeID Is Null
    AndAdminRateCode2.TermLATACodeID Is Null
    AndAdminRateCode2.TermOCNCode Is Null
    AndAdminRateCode2.TermTier Is Null)
    AndAdminRateCode.EffectiveDate = (SelectMax(AdminRateCodez.EffectiveDate)
    FromCDRRating.dbo.tbRateSchedule RS
    InnerJoin CDRRating.dbo.tbAdminMethod AM
    OnRS.RateScheduleID = AM.RateScheduleID
    InnerJoin ANPICorporate.dbo.tbState SC
    OnCDRArchive.StateCalling = SC.StateCode
    InnerJoin CDRRating.dbo.tbAdminRateCode AdminRateCodez
    OnAM.AdminMethodID = AdminRateCodez.AdminMethodID
    AndCDRArchive.MemberID = AdminRateCodez.MemberID
    AndSC.StateCodeID = AdminRateCodez.OrigStateCodeID
    AndCDRArchive.CallDate >= AdminRateCodez.EffectiveDate
    AndAdminRateCodez.EffectiveDate = (SelectMax(EffectiveDate)
    FromCDRRating.dbo.tbAdminRateCode z2
    Wherez2.AdminMethodID = AdminRateCodez.AdminMethodID
    Andz2.MemberID = AdminRateCodez.MemberID
    Andz2.OrigStateCodeID = AdminRateCodez.OrigStateCodeID
    Andz2.OrigOCNCode = AdminRateCodez.OrigOCNCode
    Andz2.EffectiveDate <= CDRArchive.CallDate)
    AndAdminRateCodez.UpdateDate = (SelectMax(UpdateDate)
    FromCDRRating.dbo.tbAdminRateCode x2
    Wherex2.AdminMethodID = AdminRateCodez.AdminMethodID
    Andx2.MemberID = AdminRateCodez.MemberID
    Andx2.OrigStateCodeID = AdminRateCodez.OrigStateCodeID
    Andx2.OrigOCNCode = AdminRateCodez.OrigOCNCode
    Andx2.EffectiveDate = AdminRateCodez.EffectiveDate)
    InnerJoin CDRRating.dbo.tbCarrierProductCode CPC
    OnAM.CarrierProductCodeID = CPC.CarrierProductCodeID
    WhereCPC.CarrierID = CDRArchive.CarrierID
    AndCPC.ProductID = CDRArchive.ProductID
    AndCPC.JurisdictionID = CDRArchive.JurisdictionID
    AndCPC.ProductTypeCodeID = CDRArchive.ProductTypeCodeID
    AndCPC.DACodeID = CDRArchive.DACodeID)


  18. Adriaan New Member

    quote:Originally posted by airjrdn

    Before running the select, I pre-built the table with the latest updatedate for each effective date.
    Not sure, but perhaps you read my post before the final edit: I replaced UpdateDate with EffDate in a few places in the script. You need EffDate, not UpdateDate - but of course you can copy the UpdateDate value into the #TMP table as well if you need it for the query further down.
  19. airjrdn New Member

    Sorry....

    Regarding the join conditions changing, there are multiple versions of this proc that are very similar. The only difference being that one version might join on OrigStateCodeID, and another version might join on OrigStateCodeID and OrigOCNCode.
  20. Adriaan New Member

    Looking at that query, I don't think the expressions in the SELECT list are too bad: for testing, if you replace them with a handful of column names then the response time should not really improve much.

    Again, I don't think you need to double-check on UpdateDate for EffDate, as long as you do that when compiling the aggregate table.

    A small issue is that you should delete the DISTINCT keyword from:
    WHERE AdminRateCode.AdminMethodID In (Select Distinct AdminMethodID ...
    A quicker of this subquery would of course be an EXISTS clause, but in any case DISTINCT only slows down the process and it is certainly not required.
  21. airjrdn New Member

    I've been out of the office the last couple of days, and will be out Monday as well, but I've made some progress testing things. Hopefully I'll have more useful information to post on Tuesday or Wednesday of next week.
  22. airjrdn New Member

    I apologize, I've been pulled off of this to push about 5 projects to completion by 8/31 and haven't had time to work on this since early this month.

Share This Page