SQL Server Performance

Hard drive suddenly running slow??

Discussion in 'General DBA Questions' started by Will192, May 29, 2007.

  1. Will192 New Member

    Over the last 24hrs my backup speed has gone from about 20-30 MB/sec to 2-5 MB/sec. The backup job is the only thing running on the server at this time.

    1. My server is running RAID 5. (I know, I know, go to RAID 10.)
    2. There is plenty of free space on all my drives.
    3. Windows Automatic Updates are off.
    4. There are no hard drives failing right now.
    5. All other drive access seems to be slow also.
    6. System Idle Process is at around 97% when no jobs are running.
    7. Out of 8 GB RAM, 200 MB is free right now, and there is a big job running in SQL Server.

    I have not defragged my drives in a while, but everything was running fast 2 days ago. If fragmentation were the cause, I would expect the slowdown to have crept up on me gradually.

    Thanks in advance for any responses to this post.


    Any idea what may be causing my slowdown?
    --------------------------------------------------------------------------------
    Live to Throw
    Throw to Live
    Will Summers

  2. satya Moderator

    Do you have any antivirus or anti-spyware tool installed on this server?
    Have you checked for any warnings in Event Viewer?

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
  3. Will192 New Member

    Thanks for the reply. No events in the viewer for application, system or security that remotely relate to hard drives or that are new in the last 24hrs.

  4. satya Moderator

    Do you have any antivirus or anti-spyware tool installed on this server?

  5. Will192 New Member

    McAfee VirusScan Enterprise 8. I just made sure it had the latest updates and ran a full scan on all local hard drives. It found nothing.

  6. satya Moderator

    Did this AV scan include the drive where the SQL Server backups are stored?
    Have you collected memory and physical disk usage during this operation using PERFMON?
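    If the PERFMON counters are logged to a CSV file, a short script can pick out the slow samples. The following is only a sketch in Python: it assumes the usual PDH-CSV layout (timestamp in the first column, one column per counter), the sample rows and the machine name are hypothetical, and the 20 ms threshold and the function name are assumptions for illustration, not recommended values.

```python
import csv
import io


def slow_samples(csv_text, counter_suffix="Avg. Disk sec/Write", threshold_sec=0.020):
    """Return (timestamp, counter, seconds) for samples above threshold_sec."""
    reader = csv.DictReader(io.StringIO(csv_text))
    timestamp_col = reader.fieldnames[0]  # first column holds the sample time
    counter_cols = [c for c in reader.fieldnames if c.endswith(counter_suffix)]
    hits = []
    for row in reader:
        for col in counter_cols:
            try:
                value = float(row[col])
            except (TypeError, ValueError):
                continue  # blank or missing sample, skip it
            if value > threshold_sec:
                hits.append((row[timestamp_col], col, value))
    return hits


# Hypothetical sample data, not taken from the server discussed in this thread.
sample = (
    '"(PDH-CSV 4.0) (Hypothetical Time Zone)(0)",'
    '"\\\\HYPOTHETICAL\\PhysicalDisk(_Total)\\Avg. Disk sec/Write"\n'
    '"05/29/2007 02:00:00.000","0.004"\n'
    '"05/29/2007 02:00:15.000","0.350"\n'
    '"05/29/2007 02:00:30.000"," "\n'
)

for timestamp, counter, seconds in slow_samples(sample):
    print(f"{timestamp}  {counter}  {seconds * 1000:.0f} ms")
```

    Matching on the counter name suffix keeps the script independent of the machine name that PERFMON puts in each column header.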

  7. Will192 New Member

    The backups are done locally, not across the network. I scanned all local drives.

    I didn't collect the info, but I did watch the performance tab during the scan and it didn't impact the system much.

  8. joechang New Member

    look in the SQL error logs for entries like: allocate 1024*1024 (1045xxx) failed

    the backup process tries to allocate 10 1M blocks of VAS
    if you don't have it, it steps down until it reaches 64K

    if this happens, you are screwed, not just for backups
    restarting SQL helps

    search VAS or Virtual Address Space
  9. Will192 New Member

    Here's the only message that raises a flag for me:

    SQL Server has encountered 15 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [E:xxxx_Log.LDF] in database [xxxx] (7). The OS file handle is 0x000005A8. The offset of the latest long IO is: 0x000000feed5a00

    There are over 20 of these within the last 24hrs. Before that, there were none in the last week.
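    A quick way to tally these warnings from a saved copy of the error log is sketched below in Python. The regular expression is built only from the message quoted above, so the wording it expects is an assumption that may not hold for every SQL Server version, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the warning quoted above; other versions may word it differently.
LONG_IO = re.compile(
    r"encountered (?P<count>\d+) occurrence\(s\) of IO requests taking longer than "
    r"(?P<secs>\d+) seconds to complete on file \[(?P<file>[^\]]+)\] "
    r"in database \[(?P<db>[^\]]+)\]"
)


def summarize_long_io(lines):
    """Return {(database, file): total occurrences} over all long-IO warnings."""
    totals = Counter()
    for line in lines:
        match = LONG_IO.search(line)
        if match:
            totals[(match.group("db"), match.group("file"))] += int(match.group("count"))
    return dict(totals)


sample = [
    "SQL Server has encountered 15 occurrence(s) of IO requests taking longer than "
    "15 seconds to complete on file [E:xxxx_Log.LDF] in database [xxxx] (7). "
    "The OS file handle is 0x000005A8.",
    "Some unrelated log line",
]

for (database, path), count in summarize_long_io(sample).items():
    print(f"{database}  {path}  {count} slow IOs")
```

    Grouping by database and file makes it easier to see whether the slow IO is confined to one drive or spread across the array.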

  10. joechang New Member

    that's a very serious issue
    normal writes to the log should take about 0.1 ms
    maybe jumping to 10-20 ms during log backups

    15 seconds, or 15,000 ms, means your disks or the controller have serious issues, i.e., something may fail soon

    how old are the disks?
    what HBA/controller do you have?
    does the storage component vendor have diagnostic tools?

    if this database has important data, consider moving to a new storage system
  11. satya Moderator

    I asked you to check for such messages, and you said nothing was found!

    Anyway, that message indicates that an IO operation was issued and took over 15 seconds to return through the IO subsystem. The problem is certainly below SQL Server in the IO stack. Refer to this KBA for your information: http://support.microsoft.com/kb/897284/en-us




  12. Will192 New Member

    quote:Originally posted by satya

    I asked you to check for such messages, and you said nothing was found!


    I was only looking at the log around the time of the initial slowdown. These messages were posted while other jobs were running a couple of hours later. My fault.

  13. Will192 New Member

    Just an update. I have been working with Dell, and they ran their DSET scan and found no problems. They had me download an ISO image that contains updates for all the drivers on my server. I plan on applying it this weekend and then doing a low-level scan of the hard drives. I cannot do this while in production for obvious reasons.

    I don't see how new drivers will fix this since the current drivers were working fine before this and I didn't change anything. This seems to be a 'blanket' fix because they don't see a problem.

    I'm going to make sure my backups have been moved off the machine before I run the updates!

  14. satya Moderator

    OK, I understand the delay in noticing those errors.

    Go ahead with your plan once you have complete backup sets for the system and user databases. Don't forget to copy them to a separate server and test them there with a RESTORE too; just copying them to a different server will not assure you that they are usable.

  15. Will192 New Member

    Sad to say, I don't have the luxury of restoring them to a different server. There are about 15 databases, each with roughly 30 GB of data and a 10 GB log. I just don't have another machine that has over half a terabyte free.

    Nightly, for all my databases, I back up, script, and bcp dump, then zip everything and copy it to another machine. I figure that's the next best thing. Every couple of months my developers ask me to make a copy of a database on another server. I count those restores as the testing of my backups.
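    The figures in this post can be sanity-checked with a few lines of Python. This is a rough sketch: it assumes every database is exactly 30 GB of data plus a 10 GB log, takes 1 GB as 1024 MB, and treats the backup speeds quoted at the top of the thread as sustained rates over the full allocated size. The function name is made up for illustration.

```python
# Figures taken from the posts in this thread; treat them as rough estimates.
DATABASES = 15
DATA_GB = 30
LOG_GB = 10

# Space a full test restore of every database would need.
total_gb = DATABASES * (DATA_GB + LOG_GB)
print(f"Space needed to restore everything: about {total_gb} GB")


def hours_to_move(size_gb, mb_per_sec):
    """Hours needed to move size_gb at a sustained rate (1 GB taken as 1024 MB)."""
    return size_gb * 1024 / mb_per_sec / 3600


# Healthy rates (20-30 MB/sec) versus the degraded rates (2-5 MB/sec).
for rate in (20, 30, 2, 5):
    print(f"{rate:>2} MB/sec -> {hours_to_move(total_gb, rate):.1f} hours")
```

    Actual backup files are usually smaller than the allocated data and log files, so the times printed are closer to an upper bound than a prediction.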

  16. satya Moderator

    Uh oh! Then make sure you verify the backups by any means available, or, if the hardware vendor can swap the drives, ask them to loan you a few so you can confirm the backups are safe.

  17. Will192 New Member

    For the situation that I am in, this is the best backup solution that I know. I know it's not optimal, but by 2pm every day I have 14 days' worth of zipped backups and the current day unzipped on the server. Off the server I have the previous 4 days' worth of backups. Every week I take the most current zipped backup and burn it to DVD.

    The last time that I uncompressed one of my backups and restored it was about 3-4 weeks ago. About once every 1 or 2 months I get a request to restore a database on a development server. I don't think that I have any reason to suspect that any of my files are corrupt, but I will talk to my boss about getting some more hard drive space on my local machine so that I can restore databases for testing.

  18. satya Moderator

    I can understand the problem at your end. Let us know the outcome of your approach in this case.

  19. Will192 New Member

    Update:
    1. updated drivers and new maintenance software from Dell, problem still exists
    2. drives seem to go into 'slow mode' when system is stressed for a long period
    3. if system is not stressed for a long period, then drives don't go into 'slow mode'

    I have not had enough system down-time to run a full scan on all the hard drives. When I run the scan it puts the drives into 'slow-mode' and requires a reboot to clear, so I can't run the scan while in production. I will have to do one drive at a time this weekend and see how much I can get finished. I only have maintenance windows on the weekend, so I've only had one chance to run the tests since installing the new drivers and scanning software.

    When I say 'stressed for a long period', I mean a process that handles tens of millions of rows and runs for at least 20-30 minutes. When I say 'slow-mode', that's my name for it. I don't have enough info on the problem to give it a better name. My other names for it aren't fit for this forum anyway. ha ha.

  20. Will192 New Member

    I hate to take a chance on jinxing my server, but I think we may have solved the problem.

    It turns out that the battery on the PERC 3 card that controls the array was slowly going dead. It would go into recharge mode and never come out because it wouldn't charge to full capacity. During the recharge the card would operate without caching, hence the slow response time.

    We replaced the PERC 3 card with a PERC 4 card on Monday and haven't gone into 'slow-mode' since. Keeping my fingers crossed that it doesn't come back. We are also seeing about a 20-25% increase in speed with the new card.

    We will wait a couple of weeks and then replace the PERC 3 card that controls the internal array with a PERC 4. The PERC 3 operates at 160 MB/sec and the PERC 4 at 320 MB/sec, so we should see an increase in speed on the internal array also. That array holds the operating system, so I'm looking forward to that upgrade!
