SQL Server Performance

Disk bottleneck?

Discussion in 'Performance Tuning for Hardware Configurations' started by soft_karma, Mar 23, 2004.

  1. soft_karma New Member

    Hi everyone,
my SQL Server is a very powerful machine:
Windows 2003 Enterprise
4 Xeon 2500 CPUs, with Hyperthreading enabled
16 GB RAM, /3GB and /PAE enabled
2 SCSI disks for the OS, 15,000 rpm, RAID 1+0
5 SCSI disks for the database files, 15,000 rpm, RAID 5
1 HP SmartArray 6400 controller

I've put all the files on the second drive. I notice that the % Idle Time of this unit is always 0%. To improve performance I tried putting the transaction log on the first drive, but I don't really notice a change: now, when the % Idle Time of the first array is 0% the second one is 100%, and vice versa. I think the problem could be the controller being too busy... Do you think that an additional controller with 2 SCSI disks could help? Or can we just add 2 disks to the current controller and put the transaction log file there?
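Before adding hardware, it may help to confirm from SQL Server itself which files sit on which array. A minimal check on SQL Server 2000; the database name below is a placeholder:

```sql
-- Show the physical path, size, and usage (data vs. log) of each file
-- in the target database. "MyDatabase" is a placeholder name.
USE MyDatabase
EXEC sp_helpfile
```

The usage column of the output tells you which files are data and which are log, and the filename column shows the drive letter each one actually lives on.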

    Here's our counters:

    Object: Memory

Page Faults/sec: 108.009
Page Reads/sec: 0.000
Page Writes/sec: 0.000
Transition Faults/sec: 48.004
Transition Pages RePurposed/sec: 0.000

    Object: LogicalDisk

                            C:        E:
    % Idle Time             96.660    5.920
    Avg. Disk Queue Length  0.033     4.751

    Object: PhysicalDisk

                              0 C:         1 E:
    % Idle Time               96.670       5.920
    Avg. Disk Bytes/Read      0.000        9294.769
    Avg. Disk Bytes/Transfer  54613.333    14301.753
    Avg. Disk Bytes/Write     54613.333    47681.641

    Object: SQLServer:Access Methods


Index Searches/sec: 12556.999


Do you think that this machine is performing poorly? Roughly, we have something like 5 million rows per day.
  2. Rob612 New Member

I am adding to the preceding (I'm working with SK on the problem) that we have absolutely no CPU problems; the CPUs are running at less than 10%.
  3. satya Moderator

Is there any performance bottleneck on the SQL database side, other than hardware?

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  4. soft_karma New Member

We aren't developers but system administrators. We are building the architecture for that software, so we can only work on hardware optimization.
  5. Argyle New Member

    I don't think it's the controller. In my experience it's often the disks. The controller usually has no problem pushing the data around.

Run the script below and see if you get high wait stats on, for example, WRITELOG. If so, I would look into:
- a controller with a battery-backed cache that could be used for caching writes
- moving away from RAID 5 to RAID 0+1 instead
- or, even better, identifying what in the application is causing the high disk usage. Is there a batch job or query that is writing far more than it needs to and could be tuned instead?

Script from sqldev.net. Edit the @finish value if you want it to run longer. As written, it loops at 10-second intervals and then reports each wait type's share of the total:

    -- simple waitstats script
    -- GertD@SQLDev.Net
    --
    if ((object_id('waitstats') is not null) and
        (objectproperty(object_id('waitstats'), 'IsUserTable') = 1))
        drop table waitstats

    create table waitstats (
        wait_type varchar(80),
        requests numeric(18, 1),
        wait_time numeric(18, 1),
        signal_wait_time numeric(18, 1),
        time_stamp datetime default getdate()
    )

    declare @start int,
            @finish int
    select @start = 1,
           @finish = 10

    dbcc sqlperf(waitstats, clear) -- clear out wait statistics

    while (@start < @finish)
    begin
        begin transaction
        insert into waitstats (wait_type, requests, wait_time, signal_wait_time)
            exec ('dbcc sqlperf(waitstats)')
        commit

        select @start = @start + 1
        waitfor delay '00:00:10' -- sample every 10 seconds
    end

    declare @begin_time datetime,
            @end_time datetime,
            @total_wait numeric(18, 1)

    select @begin_time = min(time_stamp),
           @end_time = max(time_stamp)
    from waitstats
    where wait_type = 'Total'

    -- subtract waitfor, sleep, and resource_queue from Total
    select @total_wait = sum(wait_time) + 1
    from waitstats
    where wait_type not in ('WAITFOR', 'SLEEP', 'RESOURCE_QUEUE', 'Total',
                            '***total***')
      and time_stamp = @end_time

    -- insert adjusted totals, rank by percentage descending
    insert into waitstats
    values ('***total***', 0, @total_wait, @total_wait, @end_time)

    -- reporting
    select @begin_time as [begin time],
           @end_time as [end time],
           datediff(mi, @begin_time, @end_time) as [duration in min]

    select wait_type,
           wait_time,
           percentage = cast(100 * wait_time / @total_wait as numeric(18, 1))
    from waitstats
    where wait_type not in ('WAITFOR', 'SLEEP', 'RESOURCE_QUEUE', 'Total')
      and time_stamp = @end_time
    order by percentage desc
Example output from one of our systems that has disk performance issues:


    wait_type         wait_time    percentage
    ***total***       23040.0      100.0
    WRITELOG          13726.0      59.6
    NETWORKIO         9092.0       39.5
    PAGELATCH_UP      174.0        0.8
    PAGELATCH_SH      15.0         0.1
    PAGEIOLATCH_SH    16.0         0.1
    LCK_M_S           16.0         0.1
    LCK_M_U           0.0          0.0
    LCK_M_X           0.0          0.0
    LCK_M_IS          0.0          0.0
    LCK_M_IU          0.0          0.0
    LCK_M_IX          0.0          0.0
    LCK_M_SIU         0.0          0.0
    ..
    ..
  6. joechang New Member

For the E: drive, monitor:
- Avg. Disk Queue Length
- Avg. Disk sec/Read and Avg. Disk sec/Write
- Disk Reads/sec
- Disk Writes/sec
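Alongside those Performance Monitor counters, SQL Server 2000 can report per-file IO stalls directly. A minimal sketch; it assumes you run it in the database of interest, and file id 1 (the primary data file) is just an example:

```sql
-- Per-file IO statistics for one file of the current database.
-- IoStallMS is the cumulative time (ms) spent waiting on IO for this file;
-- comparing it with NumberReads + NumberWrites gives a rough stall per IO.
declare @dbid int
select @dbid = db_id()

select DbId, FileId, NumberReads, NumberWrites, BytesRead, BytesWritten, IoStallMS
from ::fn_virtualfilestats(@dbid, 1)
```

A consistently high IoStallMS growth on the file sitting on the E: array would point the same way as a 0% idle time on that disk.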


  7. Rob612 New Member

quote: Originally posted by Argyle

    I don't think it's the controller. In my experience it's often the disks. The controller usually has no problem pushing the data around.

    Yeah, right. Actually that was misleading information. Sure enough, the bottleneck is not the controller (besides the fact that we are still waiting for the battery-backed cache to be installed, which should help a lot) but the disks.

    Last update, we have done the following:

    - Changed the database recovery model to simple
    - Moved the transaction log to another disk (yes, both the data file and the transaction log were on the same disk, E: ... not our fault, the "supposed" VAR did it...)
    - Configured SQL Server to work in "fiber" mode

    Now the situation seems better; at least we sometimes see a % Idle Time different from 0%, while before it was always 0%.

    The real question is that, not being "real" DBAs (please don't ask why we are doing this job then; not our fault, we just got involved), we have no idea what "correct" performance we should expect. The system now seems to perform correctly, but some operations are still awfully slow. I guess that asking for a "reference" value, even supplying some numbers as we did, is not an option, right?
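For reference, the changes described above map to commands like the following on SQL Server 2000. This is a sketch, with "MyDatabase" as a placeholder name; note that fiber mode only takes effect after the SQL Server service is restarted:

```sql
-- Switch the database to the simple recovery model
-- (the log is truncated on checkpoint; point-in-time log backups are no longer possible)
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE

-- Enable fiber mode ("lightweight pooling"); advanced option, service restart required
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'lightweight pooling', 1
RECONFIGURE
```

Simple recovery trades away log backups for less log management overhead, so it is worth confirming that losing point-in-time restore is acceptable before making that change on a production system.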
  8. satya Moderator

Can you get any information about SQL Server behaviour from the developers/DBAs at your end?

    quote:Originally posted by soft_karma

    We aren't developers, but System Administrator. We are building up the architecture for that software so we can only work on hw optimization.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
