IO requests taking more than 15 seconds… | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

IO requests taking more than 15 seconds…

HP ML570,Single HP P400 contoller 512MB Array Accelerator set to 100% Write System 2 SAS 15K drives Raid 1 All System Files, SQL Server binaries
Backup 4 SAS 10K drives Raid 5Backups for all databases
Data 8 SAS 15K drives Raid 10 Data and log files for most databases
TempDB 2 SAS 15K drives Raid 1 Data and log files for tempdb
Log 2 SAS 15K drives Raid 1 Single log file for biggest, busiest datbase We getting following error during off peak hours (no connected users) (!)
when data migration process runs (lots of selects * from x with (nolock) with JOINs.. couple of million rows total) SQL Server has encountered xxxx occurrence(s) of IO requests taking longer than 15 seconds
to complete on file [T:TempDB empdb.mdf] in database [tempdb] (2). Where xxxx is anywhere from 1 to 8000… While this is happening we are also seeing spikes on TempDB drives’ perfmon sec/write and sec/read counters up to 60+ seconds. If we run the same data migration process during peak hours We don’t receive any errors (!) …
Also, We don’t see any signficant spikes in sec/write sec/read on TempDB drives. What could be causing this ? I can think of following:
– Drives are getting "stuck" or going to sleep / HP surface scan being on … during idle, off peak hours
– SQL engine executing queries under a different execution plan (i.e. using hash join that uses tempdb) during off peak hours
– Hardware problem HP Diagnostic tools are not reporting any problems with drives/controller.
Also, No Anti Virus client, Symantec Backup Agent installed but not processing at that point.
Newest drivers have installed been for controller and all drives. Best solution would be to add 2+ drives to TempDB, but there are no more empty drive bays.. I’m thinking of these possible solutions in no particular order:
– Add extra controller for Log, TempDB and Backup drives
– Move TempDB to Data drive (alltogether and possibly assign 2 TempDB drives to Data)
– Move Tempdb log to Data Drive, leave TempDB data files on TempDB drive
– Move Tempdb data files to Data Drive, leave TempDB log files on TempDB drive
– Move Tempdb log to System Drive
– Consider External Enclosure add more drives to TempDB
– Consider RamDisk for TempDB (upgrade to OS required)
– Consider Solid State Disk ( SSD ) for TempDB (Are there any with SAS interface yet ?)
– Replace TempDB drives (drives are bad?)
– Replace Array Controller (controller is bad?)
– Trace Flag -T1118 (There are already multiple data files in tempdb, one for each processor core) I have already read Microsoft recommendations (KB articles and more, listed in previous threads …) Anything else I should check for ? Any additional suggestions ? Thank you for your help.
Hi , You covered most of ground beats me , but wanted to add soemthings in case u missed them 1> Optmize your maintainece palns if any 2>http://www.sql-server-performance.com/tp_analyzing_sql_server_2000_data_caching.asp 3>http://support.microsoft.com/default.aspx?scid=kb;en-us;243589 Cheers
Sat[?]
This error is reported by sql server to indicate that an IO operation has been issued, and it has taken over 15 seconds for that operation to return through the IO subsystem. The problem is certainly below SQL server in the IO stack, but the location can’t be pinpointed without further diagnostics. The KBA should help you to assesshttp://support.microsoft.com/kb/897284/en-us &http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx for a good understanding on IO.
Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.
i really think the 1 tempdb per core rule by itself is one of the really dumb rules,
yes, i know about the temp object creation bottleneck
but applying this rule where all tempdb reside on 1 pair of drives is really stupid, people write rules really need to be careful for now, i would go to 1 tempdb date file
or possibly 2, 1 in the current location, 1 on the backup drives
]]>