SQL Server Performance

Real Life Performance Riddle. CPU 100% Yow!

Discussion in 'Performance Tuning for DBAs' started by daustinash, May 15, 2006.

  1. daustinash New Member

    This is just like a MCDBA exam question, just not multiple choice.

    I have a database server that started red-lining at 95-100% processer usage sometime last week. I'd know for sure, but I'm not officially the dba here (we don't have one belive it or not). There are approximately 50 databases on it supporting mostly low traffic web applications. Half a dozen create 85% of the traffic.

    Server Spec:
    1. SQL Server 2000 Standard Edition
    2. Service pack 3
    3. 2 dual core Pentium Xeon 2.8 Ghz processors (all 4 processes red lining)
    4. 2 Gb RAM, which is all being recognized and used by the MSSQLServer process
    5. One disk controller, 2 channels, 2 drives on channel 0, 4 drives on channel 1 (RAID 5)
    6. All *.mdf and *.ldf files located on the RAID 5 array (rats)

    Performance monitor measures: (fyi I have eliminated Network Interface as a potential issue)
    1. Physical Disk: Avg Disk Bytes
    ead19819 bytes
    ead
    2. Physical Disk: Avg Disk Queue Length.200
    3. Physical Disk: Disk Bytessec750000 bytessec
    4. SQL Server:Buffer Manager:Free Pages4800
    5. SQL Server:Buffer Manager:Buffer Cache Hit Ratio99.865%
    6. Memory: Available Bytes (avg)118 Mb
    7. Memory: Page faultssec1272

    I have run filemon.exe on the *.mdf files and notice 6 databases creating the vast majority of disk IO in 8192 byte blocks, which would indicate that Sql Server is reading Extents one 8K page at a time rather than by whole extents. Normally this would indicate a severe index fragmentation, so I have rebuilt the indexes on these databases to no avail.

    Here are the questions:

    1. Does 750000 bytessec indicate high disk IO on a SCSI RAID 5 array and could this cause excessive page faults (is 1272 persec high?) causing the CPU struggle?

    2. Rebuilding the indexes didn't help Sql Server read Extents rather than individual 8K pages. Could this problem perhaps be caused by a lack of indexing all together or fragmented *.mdf files on the hard disk array (disk defragmenter indicates 28% file fragmentation, and 14% overall fragmentation?

    Thanks for any comments and enjoy? I'll post the answer when it gets figured out!

    daustinash

    dnash
    dave@gooseneck.us
  2. Luis Martin Moderator

    I've moved to relevant forum.


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  3. mmarovic Active Member

    Luis, I think it is not certification related question.
  4. Luis Martin Moderator

    Yes, I have some doubs. Wath you suggest?


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  5. mmarovic Active Member

    I think performance tunning for dbas is the right place. He said it looks like mcdba question, but that was a joke. It is real production problem.
  6. Luis Martin Moderator

    Sorry.

    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  7. Twan New Member

    Hi ya,

    can you run SQLProfiler to see if you have any queries which take more than 100ms both in terms of duration and cpu?

    Cheers
    Twan
  8. daustinash New Member

    I constantly get queries with durations between 1000 ms and 30000 ms. I get about 5-7% CPU durations over 100 ms too.

    daustinash
    dnash@verndale.com
  9. Luis Martin Moderator

    And what those queries with more than 1000ms can be optimized?. New indexes?


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  10. satya Moderator

    http://sqljunkies.com/Article/7F8518F9-FDAA-4FF3-8FC5-25E8946C8D0C.scuk fyi too.

    Satya SKJ
    Microsoft SQL Server MVP
    Contributing Editor & Forums Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  11. mmarovic Active Member

    quote:Originally posted by LuisMartin
    And what those queries with more than 1000ms can be optimized?. New indexes?
    I guess so, long queries with low cpu, probably blocking because too much rows are accessed without good indexes supporting queries.
    One or two worst queries might give a picture about what is going on.
  12. daustinash New Member

    I think you guys are right. I think the problem stems from poorly written queries against tables with very little to no useful indexing in poorly maintained databases. I can't really hope to analyze and fix the queries for 50 databases today. But there is still a nagging question here.

    I don't know how many of you are familiar with the utility filemon.exe. It is a free utility downloadable from many places, just google it. It monitors and displays file IO at the file level. You can set a filter to monitor only *.mdf files.

    SQL Server attempts to retrieve data from disk by the Extent, which is eight 8K data pages. If the database indexes are not fragmented, file mon will show a length of at lease 65,535 bytes (8 8K pages or 1 Extent). If not, and it has to read each of the 8 pages from 8 different locations on the physical disk, you will see eight reads of 8192 bytes to get the same work done.

    I have rebuilt all my indexes to hopefully eliminate index fragmentation, but still get only 8192 byte reads 95% of the time.

    Could this be due to a LACK of useful indexes, or due to *.mdf file fragmentation all over the disk?

    A conundrum to be sure.

    daustinash
    dnash@verndale.com
  13. joechang New Member

    no, your disks are fine,
    on a cold start, sql server access 64k at a time even if you only need 8k,
    afterwards, if you only need 8k, it gets 8k

    besides, your disk load is really light,
    there is no way your cpu would be pegged if you had a disk problem
    your problem for now is entirely in the high cpu consumption due to poorly written queries, poor indexing or table design
  14. daustinash New Member

    I hear you joechang. That's what I'm afraid of, but a couple of things.

    1. I didn't think about the fact that the server might only need 8192, and that therefore 8192 reads might not be a problem. Does this hold true even when you see (in filemon.exe) 200 consecutive entries for the same database scroll by, all 8192 in length.

    2. My disk load went up to just over 2.5 million bytes/sec at the height of the afternoon, up from 750,000 at 9:00 am this morning.

    I'm chalking this up to a cumulative pile of poorly designed and indexed databases with poorly written queries finally serving this server some trouble.

    Now I either need to tune about 15 databases/applications, or distribute the load across another server and make sure we do better in the future on a new server.

    Just goes to show you that good design, indexing, and tsql programming are an important skill. No one of these databases is a problem by itself, and the engineers here have always been under pressure to just "get things done". Everyone can work Enterprise Manager, but no one can build a good database application. Now the pile has gotten high enough that it's a big problem with no "quick hit" sort of fix.

    daustinash
    dnash@verndale.com
  15. joechang New Member

    if you want to contract out this work,
    see my subsite:

    http://www.sql-server-performance.com/qdpma/
  16. Luis Martin Moderator

    And if you are below mexico, send me a mail[8D].


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  17. cmdr_skywalker New Member

    i suggest you increase the virtual page to (4GB) and leave at least 512MB to your OS (don't set the SQL Memory allocation to maximum 2GB). Run the following script before and after doing changes to your settings (or off and peak hours).

    PRINT '--------------------------------------------------------------------------'
    PRINT '-= Performance Baseline Generator v1.0 =-'
    PRINT '--------------------------------------------------------------------------'
    PRINT 'Author: Perez, Jeffrey '
    PRINT 'Date: ' + CONVERT(VARCHAR, GETDATE(), 100)
    PRINT 'Server: '+ CAST(SERVERPROPERTY('ServerName') AS VARCHAR)
    PRINT 'User: ' + SUSER_SNAME() + ', ' + CURRENT_USER
    PRINT 'Workstation: ' + HOST_NAME()
    PRINT 'Version Info:'
    PRINT ' ' + REPLACE(@@VERSION,CHAR(9),' ')
    PRINT '--------------------------------------------------------------------------'
    GO
    PRINT 'Running Test...'
    GO
    IF (SELECT OBJECT_ID('test')) IS NOT NULL DROP TABLE test
    GO
    CREATE TABLE test(
    SAID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    DATA CHAR(8000)
    )
    GO
    DECLARE
    @Counter INT,
    @Start DATETIME,
    @End DATETIME,
    @Limit INT

    SET NOCOUNT ON
    SET @Start = GETDATE()
    SET @Counter = 1
    SET @Limit = 10000
    WHILE @Counter < @Limit
    BEGIN
    INSERT INTO TEST(DATA)
    SELECT SPACE(8000)

    SET @Counter = @Counter + 1
    END
    SET @End = GETDATE()
    PRINT 'Test Complete.'
    PRINT ''
    PRINT 'Database: ' + DB_NAME()
    PRINT 'Start Time:' + CONVERT(VARCHAR, @Start,114)
    PRINT 'End Time:' + CONVERT(VARCHAR, @End,114)
    PRINT 'Baseline Mark: ' + LTRIM(STR(@Limit/DATEDIFF(ss, @Start, @End))) + ' transaction/s'
    GO
    IF (SELECT OBJECT_ID('test')) IS NOT NULL DROP TABLE test
    GO
    PRINT '--------------------------------------------------------------------------'


    May the Almighty God bless us all!
    www.empoweredinformation.com
  18. cmdr_skywalker New Member

    by the way, what is the load of channel one where you don't have the mdf/log files? You can distribute the log to maximize the throughput of the RAID. Add additional files into your channel 1 drive to distribute the load. See the fixlist of SQL SP4.

    May the Almighty God bless us all!
    www.empoweredinformation.com

Share This Page