SQL Server Performance Forum – Threads Archive

Using Maxdop…

I have an insert query that is running for 6 hours. i have 6 processors running in parallel. The two tables in the left outer join has around 5.2 million records.
The table that is on the right hand side of the left outer join (BITempProductStoreCostPrice) has a clustered index on
SupplierSeriesNumber,ProductID, StoreID, StartDateID and EndDateID desc will enforcing maxdop = 1 affect the performance?
is there anyway i can make the query run below 6 hours?
My Query…
INSERT
INTO dbo.BITempProductStoreCostPrice (
SupplierSeriesNumber,
ProductID,
UnitsPerTray,
StoreID,
StartDateID,
EndDateID,
PaymentCurrencyID,
PaymentExchangeRate,
PaymentExchangeRateTypeID,
TrayCostPrice,
WeightCostPrice,
TrayDutyCharge,
TrayExciseCharge,
VATPercentage,
VatablePercentage,
CostPriceAuthorisationFlag,
CostPriceTypeCode,
InheritedPriceCode)
SELECT scpe.SupplierSeriesNumber,
scpe.ProductID,
scpe.UnitsPerTray,
scpe.StoreID,
scpe.StartDateID,
scpe.EndDateID,
scpe.PaymentCurrencyID,
scpe.PaymentExchangeRate,
scpe.PaymentExchangeRateTypeID,
scpe.TrayCostPrice,
scpe.WeightCostPrice,
scpe.TrayDutyCharge,
scpe.TrayExciseCharge,
scpe.VATPercentage,
scpe.VatablePercentage,
scpe.CostPriceAuthorisationFlag,
scpe.CostPriceTypeCode,
scpe.InheritedPriceCode
FROM dbo.BITempProductStoreCostPriceException scpe
LEFT OUTER JOIN (
SELECT SupplierSeriesNumber,
ProductID,
StoreID,
MIN(StartDateID) StartDateID,
RangeEndDateID EndDateID
FROM dbo.BITempProductStoreCostPrice
GROUP BY
SupplierSeriesNumber,
ProductID,
StoreID,
RangeEndDateID) scp
ON scpe.SupplierSeriesNumber = scp.SupplierSeriesNumber
AND scpe.ProductID = scp.ProductID
AND scpe.StoreID = scp.StoreID
AND scpe.StartDateID BETWEEN scp.StartDateID AND scp.EndDateID
AND scpe.EndDateID BETWEEN scp.StartDateID AND scp.EndDateID
WHERE scpe.InheritedPriceCode = CASE 1 WHEN 1 THEN ‘D’ WHEN 2 THEN ‘CD’ WHEN 3 THEN ‘C’ END
AND scp.ProductID IS NULL
Thanks,
Ram "It is easy to write code for a spec and walk in water, provided, both are freezed…"

Maxdop = 1 is used to tell the Query Optimizer to not use parallelism in your query.
"1" will set one CPU to be used to run the query So, Maxdop = 1 can improve the performance

What is the purpose of the expression CASE 1 WHEN 1 THEN ‘D’ WHEN 2 THEN ‘CD’ WHEN 3 THEN ‘C’ END?
Do you have an index on inheritedPriceCode and productID?
Instead of left join with group by I would also try to use correlated query returning max(startDateID) from exception table. If small enough number of rows from exception table is returned that might be faster.
About paralelism: I don’t know if avoiding paralelism is going to help. You can test it and see what happens.

Also, instead of scpe.InheritedPriceCode = long expression use scpe.InheritedPriceCode = ‘D’.

You might also try replacing the derived table with a NOT EXISTS subquery.

Hi Marovic, The exception table contains 5.2 million rows – will correlated query work faster in this case? I dont have any index on inheritedPriceCode and productID – will my query speed up if i create an index? The actual case statement looks as below… sorry to give you a wrong one before. CASE @vCount WHEN 1 THEN ‘D’ WHEN 2 THEN ‘CD’ WHEN 3 THEN ‘C’ END? Thanks,
Ram "It is easy to write code for a spec and walk in water, provided, both are freezed…"

quote:Originally posted by ramkumar.mu Hi Marovic, The exception table contains 5.2 million rows – will correlated query work faster in this case?

It doesn’t matter how many rows are in the table. It depends on how many rows will be returned when where condition is applied.

quote:
I dont have any index on inheritedPriceCode and productID – will my query speed up if i create an index?

I made an mistake, you need only an index on inheritedPriceCode.

quote:The actual case statement looks as below… sorry to give you a wrong one before. CASE @vCount WHEN 1 THEN ‘D’ WHEN 2 THEN ‘CD’ WHEN 3 THEN ‘C’ END?

Query optimizer might be confused with such construct and not choose index on inheritedPriceCode even when it would be usefull. If there are just 3 values of inheritedPriceCode and there is not value that will return less then 20% of rows index won’t be used because table or clustered index scan is faster in that case. If there is a value in 10-20% of rows index might not be used by query analyzer but you could try to force it and see if there is performance improvement. If there are less then 10% rows with specific value index should help. Finally, if 100,000s rows are returned, what is the purpose of that query? User is going to review less then 1% of such record set anyway.

Hi adriaan, can you please tell me how to do that? Thanks,
Ram "It is easy to write code for a spec and walk in water, provided, both are freezed…"

quote:Finally, if 100,000s rows are returned, what is the purpose of that query? User is going to review less then 1% of such record set anyway.

Sorry, I overlooked it is an insert query. If there is a value that can benefit from the index, I would split inserts so case expression doesn’t have to be used. Also, consider splitting inserts to smaller batches because of transaction log growth issue.

run the query without the insert, just the select portion
do this from QA on the server itself, so there is no network how long does just the select take, for the full insert query, run perfmon, watching individual cpu, and the disk counters (see www.qdpma.com for which disk counters), look at the execution plan for the full insert and just the select portion,
describe the main components.
or better yet, post the SHOWPLAN_ALL output

Ram, You use the same derived table, except you move it to the WHERE clause and you change the join columns to a WHERE clause in the subquery – FROM dbo.BITempProductStoreCostPriceException scpe
WHERE scpe.InheritedPriceCode = CASE 1 WHEN 1 THEN ‘D’ WHEN 2 THEN ‘CD’ WHEN 3 THEN ‘C’ END
AND NOT EXISTS
(SELECT scp.SupplierSeriesNumber
FROM dbo.BITempProductStoreCostPrice scp
WHERE scpe.SupplierSeriesNumber = scp.SupplierSeriesNumber
AND scpe.ProductID = scp.ProductID AND scpe.StoreID = scp.StoreID
AND scpe.StartDateID BETWEEN scp.StartDateID AND scp.EndDateID
AND scpe.EndDateID BETWEEN scp.StartDateID AND scp.EndDateID) – but I’m not quite sure what you’re doing with the MIN() aggregate, combined with the BETWEEN criteria. Funny to see "ID" after a date column.

THanks adriaan. The ID at the end of date column is for calculation purposes. those columns are varchar and not date columns. also, these tables are just staging tables for tranformation. at the end of staging, those id columns would be converted to datetime datatype. Thanks,
Ram "It is easy to write code for a spec and walk in water, provided, both are freezed…"

Do the staging tables have indexes?

Have you tried turning parallelism off?
http://support.microsoft.com/?kbid=837983 fyi.
However, there is a threshold below which a parallel plan will not be considered. By default, this "cost threshold for parallelism" is set to 5. With this setting, SQL-Server will not consider a parallel plan if the estimated execution time of the serial plan is less than 5 seconds. Satya SKJ
Microsoft SQL Server MVP
Contributing Editor & Forums Moderator
http://www.SQL-Server-Performance.Com
This posting is provided â€œAS ISâ€ with no rights for the sake of knowledge sharing.

Adriaan, The staging table BITempProductStoreCostPrice has a clustered index on the join columns that is used in the query. Satya, I tried turning off the parallelism and create a nonclustered index on the group by columns of the
BITempProductStoreCostPrice table. the query ran for 3 days and I killed it manually. Thanks,
Ram "It is easy to write code for a spec and walk in water, provided, both are freezed…"

]]>