This query is killing my 4 CPU's(100%) | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

This query is killing my 4 CPU’s(100%)

Hi,
I have a query that is killing my 4 CPU production server. For 10 minutes the CPU’s are going to 100%. The query is running against a single table(tran_tb) with 20 million rows, is using a single date column(tdt) and is performing a Nested Loop betwenn 3 million rows and 0.5 million rows.
The query was running fine but the developers didn’t considered the table growth which got now to 20 millions rows. This is the query :
select DateWindow.CurrTo,
Count(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.CurrFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.CurrTo then dbo.tran_tb.tran_id end) CurrCount,
Count(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.PriorFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.PriorTo then dbo.tran_tb.tran_id end) PriorCount
from
(Select distinct substring(convert(char,dbo.tran_tb.tdt-7,20),1,13) CurrFrom,
substring(convert(char,dbo.tran_tb.tdt,20),1,13) CurrTo,
substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt-7),20),1,13) PriorFrom,
substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt),20),1,13) PriorTo
from dbo.tran_tb
where
substring(convert(char,dbo.tran_tb.tdt,20),1,13) > substring(convert(char,getdate()-1,20),1,13) and
substring(convert(char,dbo.tran_tb.tdt,20),1,13) <= substring(convert(char,getdate(),20),1,13) and
dbo.tran_tb.tdt between DATEADD(HH,-2,getdate()-1) and getdate()) as DateWindow,
dbo.tran_tb where
dbo.tran_tb.tdt between DATEADD(HH,-2,getdate()-9) and getdate()
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.PriorFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.CurrTo
group by DateWindow.CurrTo I would really appreciate any ideas to improve the performance of this query. Best Regards

I think there are 2 major problems.
1. SQL Server does not generate actual binary code, so all of the convert & dateadd and substring ops are very expensive, especially is 20M rows must be processes and that the same value must be computed several times (i am not sure if it can remove duplicate function calls with the same arguments). 2. it looks you have almost duplicate SARG, one "tdt between …" so it can use an index, and a second of the form fn(tdt) > fn(getdate()), which is the real SARG, but cannot use an index seek. I would declare 2 datetime variables to precomputer
substring(convert(char,getdate()-1,20),1,13)
and substring(convert(char,getdate(),20),1,13)
but this alone will not gain much for you. I think the real solution is to add 4 or more datetime (small?) columns that stores the values rounded below converted back to datetime
substring(convert(char,dbo.tran_tb.tdt,20),1,13)
substring(convert(char,dbo.tran_tb.tdt-7,20),1,13)
substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt-7),20),1,13)
substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt),20),1,13) I also have concerns about what is it that the SELECT DISTINCT subquery as its join back to tran_tb does not have an equality. This is why its a loop join, as hash and merge require an equality condition

Thanks joe,
This is the execution plan : RowsExecutesPhysical OPRowsIOCPUCOST
231NULL75,349NULLNULL256,749
231Compute Scalar75,3490.000256,749
231Parallelism75,3490.000256,749
234Hash Match75,3490.002256,749
924Parallelism301,3960.002256,747
924Hash Match301,3960.0018,559256,745
631952164Nested Loops5,677,478,9000.00131,844238,186
32715184Clustered Index Scan5,689,257137.6310148
752449143271518Table Spool11,0880.41011,561
924Hash Match11,0880.004206
18107564Compute Scalar512,0330.000202
18107564Clustered Index Scan512,033137.6321159 I will try to work on your suggestions. Thank again
Hi Joe,
Do you know how to convert substring(convert(char,dbo.tran_tb.tdt-7,20),1,13) to datetime ? Thanks
this query can actually run in 10min?
notice 1 clustered index scan results in 512,033 rows
the other 5,689,257 rows but the join generates 5,677,478,900!!! so i think there is a join condition missing, unless you really meant to do a cross join to the distinct query

Hi Joe,
Yes the query completes in 10 minutes but the 10 CPU’s are at 100% during this time. This is not my query and I don’t know why they don’t have a join condition… Anyway this is a strange query for me !

Sorry 4 CPU’s and 3 GB
i still its time to go ask the business owners of this query: what is the business logic being implemented, complete description, look for any indication whether the business logic actually implies a cross join
also check whether each CurrTo value in the subquery is distinct,

I will do so.
Thanks Joe
let me clarify on the subquery SELECT DISTINCT CurrFrom, CurrTo, PriorFrom, Prior
FROM tran_tb
WHERE xx as written, each row returned is distinct, but because the outer query groups by CurrTo, i am guessing that each row returned also has a distinct CurrTo value

1. I believe next code:
quote:
Count(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.CurrFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.CurrTo then dbo.tran_tb.tran_id end) CurrCount,
Count(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.PriorFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.PriorTo then dbo.tran_tb.tran_id end) PriorCount
is equivalent to:
count(*), count(*)
I think you meant:
sum(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.CurrFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.CurrTo
then 1
else 0
end) as CurrCount,
sum(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.PriorFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.PriorFrom
then 1
else 0
end) as PriorCount
2. Code:
select DateWindow.CurrTo,
sum(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.CurrFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.CurrTo
then 1
else 0
end) as CurrCount,
sum(Case when substring(convert(char,dbo.tran_tb.tdt,20),1,13) >= DateWindow.PriorFrom
and substring(convert(char,dbo.tran_tb.tdt,20),1,13) < DateWindow.PriorFrom
then 1
else 0
end) as PriorCount
from
(Select distinct
cast(substring(convert(char,dbo.tran_tb.tdt-7,20),1,13) + ‘:00:00’ as dateTime) as CurrFrom,
cast(substring(convert(char,dbo.tran_tb.tdt,20),1,13) + ‘:00:00’ as dateTime) as CurrTo,
cast(substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt-7),20),1,13) + ‘:00:00’ as dateTime) as PriorFrom,
cast(substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt),20),1,13) + ‘:00:00’ as dateTime) as PriorTo
from dbo.tran_tb
where
dbo.tran_tb.tdt > cast(substring(convert(char,getdate()-1,20),1,13) + ‘:00:00’ as dateTime) and
dbo.tran_tb.tdt <= getDate() and
((datePart(day, dbo.tran_tb.tdt) = datePart(day, getDate()-1) and datePart(hour, tran_tb.tdt) <> datePart(hour, getdate()) or
datePart(day, dbo.tran_tb.tdt) = datePart(day, getDate()) ) as DateWindow
join dbo.tran_tb as t on t.tdt >= DateWindow.PriorFrom and
t.tdt < DateWindow.CurrTo
where
t.tdt between DATEADD(HH,-2,getdate()-9) and getdate()
group by DateWindow.CurrTo
go
should be equivalent to your query (except for aggregate calculation changes). However I was not able to test it, so it should be reviewed and tested carefully. The main idea is to have conditions on DateTime column so index on such column can be effectively used. If you don’t have it, you should create one on tdt column. Please try that query and let me know if performance and cpu usage has improved and if query is completely equivalent with original one (except for aggregates).
I was wrong about aggregation part. This is interesting trick. Even count aggregate function eliminates nulls. So my query should be equivalent to yours.
You really should try to stay in the native datatype and not cast out to another if possible. With that said need to break this down so I fully understand what I am doing to your code. Please test your subquery of Select
distinct
substring(convert(char,dbo.tran_tb.tdt-7,20),1,13) CurrFrom,
substring(convert(char,dbo.tran_tb.tdt,20),1,13) CurrTo,
substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt-7),20),1,13) PriorFrom,
substring(convert(char,dateadd(hh,-1,dbo.tran_tb.tdt),20),1,13) PriorTo
from
dbo.tran_tb
where
substring(convert(char,dbo.tran_tb.tdt,20),1,13) > substring(convert(char,getdate()-1,20),1,13) and
substring(convert(char,dbo.tran_tb.tdt,20),1,13) <= substring(convert(char,getdate(),20),1,13) and
dbo.tran_tb.tdt between DATEADD(HH,-2,getdate()-1) and getdate()) That the following produces the same number of records and that other than the fact I mine produces the datetime version in full it produces the same records. Select
distinct
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),-7) CurrFrom,
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) CurrTo,
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt)-1,-7) PriorFrom,
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt)-1,0) PriorTo
from
dbo.tran_tb
where
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) between dateadd(n,1,dateadd(hh,datediff(hh,0,getdate()),-1)) and dateadd(hh,datediff(hh,0,getdate()),0) If it does then try this and see what you get SELECT
DateWindow.CurrTo,
Sum(Case when
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) BETWEEN DateWindow.CurrFrom AND DATEADD(n,-1,DateWindow.CurrTo)
then 1 else 0 end) CurrCount,
Sum(Case when
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) BETWEEN DateWindow.PriorFrom AND DATEADD(n,-1,DateWindow.PriorTo)
then 1 else 0 end) PriorCount
FROM
dbo.tran_tb
INNER JOIN
(
Select
distinct
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),-7) CurrFrom,
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) CurrTo,
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt)-1,-7) PriorFrom,
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt)-1,0) PriorTo
from
dbo.tran_tb
where
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) between dateadd(n,1,dateadd(hh,datediff(hh,0,getdate()),-1)) and dateadd(hh,datediff(hh,0,getdate()),0)
) as DateWindow
ON
dateadd(hh,datediff(hh,0,dbo.tran_tb.tdt),0) BETWEEN DateWindow.PriorFrom AND DATEADD(n,-1,DateWindow.CurrTo)
WHERE
dbo.tran_tb.tdt between DATEADD(HH,-2,getdate()-9) and getdate()
GROUP BY
DateWindow.CurrTo
Also you should have an index on tran_tb.tdt for this query to work best.
]]>