Update performance issue | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Update performance issue

I have a very simple update command that is taking an enormous amount of time to run in a production installation. In test runs with about a million and a half rows it runs in about 40 seconds, but in the installation with I would guess around 10 million rows, it runs in 11 hours! The query is shown below: UPDATE IMP_HDR
SET CUST_ID = C.CUST_ID,
CUST_MATCH_CODE = ‘C’
FROM
CUST C
WHERE
C.CLIENT_ID = IMP_HDR.CLIENT_ID AND
C.LAST_CUSTOMER_ID = IMP_HDR.CUSTOMER_ID
; The primary keys IMP_HDR.ID (not involved in query) and CUST.CUST_ID
They are both Non-clustered.
Indexes also exist on IMP_HDR.CUST_ID, CUST.(CLIENT_ID,LAST_CUSTOMER_ID)
Simple recovery model logging is being used. My best guess is the changing of the data in the IMP_HDR.CUST_ID field from null to a value is expanding the record size causing all pages to be rewritten. Is that possible? Any other opinions, or insight on how to fix this is greatly appreciated. Thanks,
Dale Lear.
ON the production installation make sure the indexes are taken care from defragmentation and in this case I would suggest to look at the execution plan. Satya SKJ
Microsoft SQL Server MVP
Contributing Editor & Forums Moderator
http://www.SQL-Server-Performance.Com
This posting is provided �AS IS� with no rights for the sake of knowledge sharing.
Thanks for the reply. The execution plan shows a has table being built from the CUST table. Perhaps it is reunning out of hash memory in the production system. I might try doing the query in steps, using one chunk of the CUST file at a time. I will let this forum know what solution was found.
D.
The problem was solved by dividing the update into 10 parts:
Fo example, here was the query for the second 10%: UPDATE IMP_HDR WITH (TABLOCKX)
SET CUST_ID = C.CUST_ID,
CUST_MATCH_CODE = ‘C’
FROM
CUST C
WHERE
IMP_HDR.CUST_MATCH_CODE = ‘N’ and
C.CLIENT_ID = IMP_HDR.CLIENT_ID AND
C.LAST_CUSTOMER_ID = IMP_HDR.CUSTOMER_ID AND
ID BETWEEN (SELECT MAX(ID) FROM IMP_HDR)/10 AND 2*(SELECT MAX(ID) FROM IMP_HDR)/10 Each 10% ran in about 2 minutes, changing the overall time to 20 minutes (from the original 11 hours!) I think adding the restriction not only broke the problem into more managable pieces for SQL Server, but also persuaded it to use a better execution plan.
Hi, The addition of "IMP_HDR.CUST_MATCH_CODE = ‘N’" and the between condition on the primary key of IMP_HDR could have resulted in less number rows selected for scanning the CUST C table. I think this the reason why the query performance improved so much.
Question for all: Is this statement – UPDATE IMP_HDR
SET CUST_ID = C.CUST_ID,
CUST_MATCH_CODE = ‘C’
FROM CUST C
WHERE C.CLIENT_ID = IMP_HDR.CLIENT_ID AND
C.LAST_CUSTOMER_ID = IMP_HDR.CUSTOMER_ID equivalent to this statement – UPDATE IMP_HDR
SET CUST_ID = C.CUST_ID,
CUST_MATCH_CODE = ‘C’
FROM CUST C INNER JOIN IMP_HDR ih ON C.CLIENT_ID = IMP_HDR.CLIENT_ID
AND C.LAST_CUSTOMER_ID = IMP_HDR.CUSTOMER_ID SQLGuru
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br /> Question for all: Is this statement -<br /><br />UPDATE IMP_HDR<br />SET CUST_ID = C.CUST_ID,<br />CUST_MATCH_CODE = ‘C'<br />FROM CUST C<br />WHERE C.CLIENT_ID = IMP_HDR.CLIENT_ID AND<br />C.LAST_CUSTOMER_ID = IMP_HDR.CUSTOMER_ID<br /><br />equivalent to this statement -<br /><br />UPDATE IMP_HDR<br />SET CUST_ID = C.CUST_ID,<br />CUST_MATCH_CODE = ‘C'<br />FROM CUST C INNER JOIN IMP_HDR ih ON C.CLIENT_ID = IMP_HDR.CLIENT_ID<br />AND C.LAST_CUSTOMER_ID = IMP_HDR.CUSTOMER_ID<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />You mean apart from the fact that your second statement will never execute the way it is written? Nice try! [<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a>
]]>