SQL Server Performance

Sporadic high CPU usage grinds queries to a halt

Discussion in 'Performance Tuning for DBAs' started by Pendle_Gazza, Sep 2, 2004.

  1. Pendle_Gazza New Member

    Hi

    We are running a news article database on a server with the spec below, and periodically (now probably twice a day), the CPU usage will go above 70% and never go back below that figure. It has the knock on effect of causing all webpages accessing the database to take in excess of one minute to load.

    We have tried running performance monitor and Profiler at the time of collapse, and read as much documentation as possible, yet can find nothing untoward which may explain these strange occurrences. If anybody could help by suggesting anything we would be exceptionally grateful, as this is causing us no end of hassle.

    Previous things tried have included installing a fresh copy of everything on a new server, and recreating the database with a new file importing only recent relevant data.

    Regards
    Gary Broughton

    Dual 1GHZ Pentium III
    Windows 2000 SP3
    SQL Server 2000 SP3a
    2GB RAM
    One database 1.2GB
  2. satya Moderator

    What is the memory setting on SQL?
    Any other applications sharing SQL server server resources?
    What is the level of service packs on SQL & OS?
    How often you perform DBCC checks & reindexing on databases?

    Collect PROCESS counter from PERFMON in order to see which process is utilising most of resources.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  3. Pendle_Gazza New Member

    Hi satya

    We've got 2048MB, which is dynamically configured in SQL from 0 to 2048MB. Reserve physical memory is unchecked, and minimum query memory is 1024KB. Both processors are used, with 'boost SQL' and 'use fibers' unchecked (though both have been on while testing in the past). SQL Server 2000 SP3 (8.00.760) is the only "third-party" application installed on the server, and Windows 2000 is on SP4 with all the updates. Terminal Services is used, but IIS is disabled.

    DBCC CHECKDB shows "CHECKDB found 0 allocation errors and 0 consistency errors in database 'newsdb'" - I've never seen anything different with that. I've run index tuning wizard on several trace files, although only once did it ever add an index. Sqlsrvr and idle are utilising most resources, everything else is hardly on the scale.

    This is so bizarre as it started happening a couple of months ago, and tends to go wrong late in the evening, occassionally but not always at the same time (there are no scheduled tasks doing anything to the database at these times).

    Many thanks for your reply
    Gary
  4. Adriaan New Member

    Terminal Services is as big a resource-hog as SQL Server, so they'll compete like mad. My first test would be to disable TS and see what good that brings, but of course you may need to bring in another server to do TS.
  5. Luis Martin Moderator

    There is any scheduled job during this time?


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    All postings are provided “AS IS” with no warranties for accuracy.

  6. satya Moderator

    I agree with Adriaan to disable the TS on SQL Server and test the performance.
    It looks like its time to fine tune the long running queries in order to gain the performance during resource-intensive queries on SQL server.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  7. Pendle_Gazza New Member

    Hi

    There are no scheduled jobs (at least which we know about) that clash with the times it goes wrong, and the Profiler at these periods suggests nothing sinister. Also, while the TS service is installed, it is very rarely used. TS has been on for many months (I need some method of remote connection), and the problem has only surfaced about two months ago. It's just so weird that the CPU fluctuates between 70% and 100% when it goes wrong and grinds all things SQL to a halt.

    Thanks
    Gary
  8. Dave Wells New Member

    I know that this may sound like "state the bleedin obvious" but have you checked that it is in fact sql which is using the cpu?
  9. Pendle_Gazza New Member

    Hi Dave, yes I'm afraid it is. I really do wish it was something else! Everything SQL Server grinds to an almost complete halt - Enterprise Manager, Profiler etc., and often the services fail to restart and the machine has to be rebooted (after which its fine for a while).

    Regards
    Gary
  10. joechang New Member

    what are the characteristics of your database application,<br />ie, i take it that there were no very expensive calls that showed up on profiler?<br />before the your system becomes unresponsive, what is the system CPU %<br />what is the SQLServer<img src='/community/emoticons/emotion-7.gif' alt=':S' />QL Statistics - Batch Requests/sec<br />how does the perfmon counter: Process - Virtual Bytes, sqlservr trend before and during the problem?<br />does your applications make extended stored procedure calls? example sp_prepare, sp_cursor_open
  11. derrickleggett New Member

    Does this happen every day, or just periodically in addition to happening in the evenings? Are you saving perfmon counters to a file so you can inspect historically during these times? Same for Profiler traces?

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  12. satya Moderator

    Any anti-virus applications installed or have you checked for any virus attacks.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  13. Pendle_Gazza New Member

    The most expensive calls on the profiler are usually the 'Audit Logout' ones. When working fine, the CPU fluctuates wildly (always sqlsrvr), but I would estimate an overall average of 30%. The batch requests per second today (working fine so far) peaked at 1310 and averaged 378. Virtual bytes are steady at 1810535219. Stored procedures are not used on this database.
  14. satya Moderator

    So the problem is now narrowed down to the so-called Audit Logout event and figure out the reasons for the existence in Audit Logout events' duration between private network connection and remote connection, and what needs to be done to improve the performance.

    Check the connection string used between private and remote connections to the sQL Server.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  15. Pendle_Gazza New Member

    I don't think that is the problem, as it happens when the server is running fine - just pointing out that it had the longest duration.

    I have now taken a profile and perfmon when it's wrong (it's hard to catch before and as it starts!) and have noticed there seem to be around 15 'Lock Released' or 'Lock Acquired' messages at exactly the same millisecond every 0.017 seconds.

    I don't know if forwarding any data via MSN or Yahoo is permitted (?), but if anyone would mind casting even the briefest eye over the data, and or the perfmon graph, I would be eternally grateful.
  16. joechang New Member

    i have seen symptoms like this before but was never able to completely resolve it.
    both cases, it W2K and P4 based Xeon systems.
    someone speculated that it could be a problem of the W2K scheduler not behaving well on the P4 processor architecture (the internal of the P4 does aggressive out-of-order ops) and that such problems were fixed in w2k3.
    another speculation was that it only occured in 8 CPU systems.
    but you have a 2 proc PIII

    i think this is worth opening a MS PSS case, if you think it will reoccur.
    you can actually get the PSS script on the MS downloads by
    searching: PSS
    get the PSS Service Center Labs, and Microsoft Product Support's Reporting Tools
    i think the best way is to get the tools from the above, let it run (with plenty of disk space)
    if you can capture the onset of the occurence, then open the PSS case, otherwise no sense in paying the $99 if can't send MS the problem trace

  17. derrickleggett New Member

    Jose, we had the same problem with a Scheduler causing problems like this. It was on an HP DL760 8-processor box hooked up to the EMC SAN running Windows 2k3 though. Microsoft and HP said it wasn't what was causing our problems. We ended up failing back to W2k Advanced Server after replacing drivers, hardware, etc. Since then, PSS and HP have found the problem.

    I would call PSS and open up a ticket for this though. If you are a partner, it won't cost you anything. If you aren't, it's $250 and well worth the money. You can't be rebooting servers during the middle of the day. The forums are nice to troubleshoot, but there's only so much we can do if we're not there. PSS and your vendors have the engineering resources to throw at issues like this.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  18. joechang New Member

    PSS by email is $99?, tel call is $250
    did PSS & HP provide details on the problem?
  19. derrickleggett New Member

    Yeah, I'd have to search up the old notes on it. I'll see if I can pull it together. I never seem to get around to stuff like that though when I have to dig through archives. <img src='/community/emoticons/emotion-1.gif' alt=':)' /> If I recall, it wasn't the scheduler actually "causing" the problems. It was a issue with the disk IO speed, a driver, and w2k3 not mixing well. We would get a ton of Scheduler errors, then the CPU would max out, and the server would lock up after about 30 minutes of this. The pattern just reapeated over and over. It was really aggrivating.<br /><br />MeanOldDBA<br />derrickleggett@hotmail.com<br /><br />When life gives you a lemon, fire the DBA.
  20. Pendle_Gazza New Member

    Hi

    Unfortunately been away for three days. Could you please clarify who or what PSS is please, as I've not heard of them/it?

    One strange thing about this is that the database worked happily on the same machine for about three years, and only started going wrong in July. It does make me wonder if a Windows Update patch has had an adverse effect, although as one of the attempted subsequent solutions was to reinstall from scratch we're unable to tell what was installed when of course. :-(

    Thanks
    Gary
  21. satya Moderator

Share This Page