SQL Server Performance

SQL Server on SAN with RAID 10

Discussion in 'Performance Tuning for Hardware Configurations' started by bemall, Sep 7, 2005.

  1. bemall New Member

    I would like a breakdown of the setup for SQL Server 2000 in a RAID 10 SAN environment: OS, MS SQL Server, data files, indexes, transaction log, and backups.

    E.g.
    C: OS - Random - Local Drive
    D: MS SQL Server - Random - RAID 10
    F: Database Data Files - Random - RAID 10
    I: Indexes - Random - RAID 10
    H: Transaction Log - Sequential - RAID 1 or RAID 10? (if this were a RAID 5 setup, then RAID 1)
    J: Backups - RAID 1?
    Q: Quorum for clustering - RAID 10?
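
    The layout above would correspond roughly to a CREATE DATABASE along these lines (a sketch only; the database name, file names, and sizes are illustrative, not part of the question):

    ```sql
    -- Hypothetical database matching the drive layout above
    -- (SQL Server 2000 syntax; all names and sizes are made up).
    CREATE DATABASE Sales
    ON PRIMARY
        (NAME = Sales_Data, FILENAME = 'F:\MSSQL\Data\Sales_Data.mdf', SIZE = 20480MB),
    FILEGROUP Indexes
        (NAME = Sales_Idx, FILENAME = 'I:\MSSQL\Data\Sales_Idx.ndf', SIZE = 10240MB)
    LOG ON
        (NAME = Sales_Log, FILENAME = 'H:\MSSQL\Log\Sales_Log.ldf', SIZE = 4096MB)
    ```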

    Thank you

    Beemall
  2. Luis Martin Moderator

    I've moved this to the hardware forum.


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  3. bertcord New Member

    Don't break up indexes and data unless you can give me a good reason why... Don't use RAID 10 unless you have at least 20% writes. Use RAID 0 for backups... you are backing up to tape, right? Quorum... this can be on the same LUN as the OS.
  4. satya Moderator

    What is the expected database growth over the next 3 to 5 years? If it is huge, then you can follow your setup and use RAID 5 for the transaction log files.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  5. bemall New Member

    This setup is meant to accommodate growth in concurrency and data size. The indexes are pretty large, 600 MB to a gigabyte or so for the different tables, plus I want to ensure separate threads for the non-sequential index data.

    The backup is two-step. The first step is illustrated in my breakdown, i.e., to the backup drive, before the file is grabbed by Tivoli in the background.
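
    The first step of such a two-step backup is just a plain disk backup; a minimal sketch, assuming a hypothetical database named Sales and a backup drive J: as in the layout above:

    ```sql
    -- Step 1: back up to the local backup drive; the tape software
    -- (Tivoli here) then picks the file up from disk. Path is illustrative.
    BACKUP DATABASE Sales
    TO DISK = 'J:\Backups\Sales_Full.bak'
    WITH INIT
    ```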

    Question:
    Could someone direct me to SAN optimization whitepapers for SQL Server?

    Beemall
  6. bertcord New Member

    SQL Server can create multiple threads per data file. You do not need to create separate files for this.
  7. FrankKalis Moderator

    quote:
    Question:
    Could someone direct me to SAN optimization whitepapers for SQL Server?
    Search the website of your SAN vendor. Most, like EMC, should have such information online.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    I support PASS Deutschland e.V. (http://www.sqlpass.de)
  8. FrankKalis Moderator

    quote:Originally posted by bertcord

    SQL Server can create multiple threads per data file. You do not need to create separate files for this.
    Bert,
    This is something I'm uncertain about. Starting with SQL Server 2000, SQL Server can start multiple threads to retrieve data, even when you only have one filegroup residing in one physical file. What's the advantage here? Don't multiple threads "require" multiple disk heads reading the data to be of any significant advantage? From my understanding, multiple threads are more of an advantage when multiple physical disks are involved. Am I wrong?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    I support PASS Deutschland e.V. (http://www.sqlpass.de)
  9. satya Moderator

    To reduce overall I/O contention and improve parallel operations, consider partitioning table data and indexes.

    One of the MS guides says:
    When configuring a SQL Server that will contain only a few GB of data and will not sustain heavy read or write activity, it is not as important to be concerned with disk I/O and with balancing SQL Server I/O activity across hard drives for maximum performance. But to build larger SQL Server databases that will contain hundreds of gigabytes or even terabytes of data and/or sustain heavy read/write activity, it is necessary to drive the configuration around maximizing SQL Server disk I/O performance by load-balancing across multiple hard drives.


    An effective technique for improving the performance of large SQL Server databases that are stored on multiple disk drives is to create disk I/O parallelism, which is the simultaneous reading from and writing to multiple disk drives. RAID implements disk I/O parallelism through hardware and software.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  10. bertcord New Member

    quote:Originally posted by FrankKalis


    quote:Originally posted by bertcord

    SQL Server can create multiple threads per data file. You do not need to create separate files for this.
    Bert,
    This is something I'm uncertain about. Starting with SQL Server 2000, SQL Server can start multiple threads to retrieve data, even when you only have one filegroup residing in one physical file. What's the advantage here? Don't multiple threads "require" multiple disk heads reading the data to be of any significant advantage? From my understanding, multiple threads are more of an advantage when multiple physical disks are involved. Am I wrong?

    Hey Frank, just realized you had asked me a question.

    Yes, I agree with you: if you only have one disk, multiple threads in my view would not help much. But if that drive letter is an array with multiple drives, then it could be different; either way, I let SQL Server determine how many threads to create. I did some tests a few months ago to determine whether multiple files on the same drive improved performance. I concluded that they did not, and in some cases they resulted in worse performance. As part of my testing I placed about 4 files on the same drive letter; my data loads did not perform any faster, and I noticed higher context switching and disk queues. If you are interested, I can send you the paper I wrote on the subject for our internal DBAs.

    The main reason I point this out is that I see many people still creating multiple files because they think it will perform better, when this is not the case.
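
    One way to check whether extra files actually pay off is to compare per-file I/O statistics before and after a change; in SQL Server 2000 these are exposed through fn_virtualfilestats (a sketch; the database name is illustrative):

    ```sql
    -- Per-file read/write counts and cumulative I/O stall time
    -- (SQL Server 2000; TVF arguments must be constants or variables).
    DECLARE @db int
    SET @db = DB_ID('Sales')
    SELECT FileId, NumberReads, NumberWrites, IoStallMS
    FROM ::fn_virtualfilestats(@db, 1)   -- file id 1 = the primary data file
    ```

    High IoStallMS on one file relative to its read/write volume points at a drive that is the bottleneck, regardless of how many files share it.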

    Bert
  11. FrankKalis Moderator

    Thanks Bert! That would be really good of you. I'll drop you a PM via the forum. And in any case I'll keep it confidential, if you wish. :)

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    I support PASS Deutschland e.V. (http://www.sqlpass.de)
  12. derrickleggett New Member

    Bert, I think you're encountering some issues people currently have when they implement SQL Server in a SAN environment. A lot of the ideas people have are wrong because all the variables that created those ideas have changed. SAN equipment (especially high-end SAN equipment) processes data much more efficiently due to an enhanced backplane, configurable and optimizable pathing, storage processors, and cache. Cache, cache, cache, cache.

    Here is what I usually try to do in a SAN environment IF I can get dedicated disks just for the SQL Server:

    Tempdb - Dedicated RAID 10 if I can get it. RAID 5 is usually fine. You usually don't need a lot of disks for this.
    Data - I usually go with RAID 5. You might want RAID 10 for a really, really high-transaction system.
    Log - RAID 10. That's dedicated RAID 10 if you can get it. It makes a big difference.
    Backup - RAID 5 is usually fine. I'm not sure why you would go with RAID 1. If you don't care about redundancy, you might consider RAID 0 for speed and cost. Go with RAID 10 if you want lots of speed, lots of redundancy, and cost just doesn't really matter. If this is the case, call me. I'll work a contract for you.

    On a shared-disk SAN, the variables above just don't matter that much anymore. You are going to find that performance of the data and log space will suffer significantly, though, if you don't have enough dedicated spindles, once you take into account spindles vs. the total load of the SAN (not just the SQL Server load). A lot of SAN engineers who haven't worked with SQL Server will give you the official vendor spiel and lead you down a road to disaster. There is NOTHING that replaces the maximum IO capacity of individual disk drives and multiple IO points.

    Hopefully, this gives you something to think about. :) Look up Joe Chang. He has a lot of interesting information and metrics on SANs and SQL Server. It will be worth reading.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  13. bertcord New Member

    Derrick,

    I would agree with everything mentioned above except for tempdb. Why dedicate drives to tempdb? If your tempdb is highly active, having it spread across all of your spindles would help it out. If tempdb is not active, then you would be wasting those spindles. Also, if you are going to dedicate spindles to tempdb, why RAID 5? Tempdb has a large percentage of writes, so RAID 5 would hurt you.

    Bert
  14. derrickleggett New Member

    It would depend on the setup of the system and usage, of course. We used to put our tempdb on RAID 1 drives on each of the servers themselves, even in a SAN configuration. We found that tempdb on SAN RAID 5 was much more efficient due to SAN cache. If you can afford to put it on RAID 10, that's better yet, though. Normally you try to minimize the use of tempdb anyway; however, when that's not possible, the answer is still invariably that the higher the dedicated spindle-to-IO ratio, the better the performance.

    I think we probably actually agree on this one. :)

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  15. biged123456 New Member

    I have heard somewhere that you should not use RAID 5 unless 99% of your disk activity is read-related. That certainly does apply for tempdb or transaction logs. If we are talking about servers that are not going to push the disks no matter what configuration you have, I guess it probably isn't worth discussing in the first place...
  16. derrickleggett New Member

    quote:
    I have heard somewhere that you should not use RAID 5 unless 99% of your disk activity is read-related. That certainly does apply for tempdb or transaction logs.

    I'm confused. How do those two match up? Am I missing something? I disagree with the RAID 5 statement completely. Read/write percentage doesn't come into play until you reach certain points in IOPS relative to the maximum capacity of the drives themselves. When you start reaching higher IOPS, the differences between RAID types become extremely obvious. RAID 5 is still a decent solution most of the time, though, if you're not over 30% writes. RAID 10 is always good if you have the budget. If the budget is limited, though, the transaction logs, followed by tempdb, are some of the best places to use it.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  17. agw New Member

    I'm soon going to have a new server with a SAN in RAID 10 which will be dedicated to SQL Server. The DB will be in excess of 500 GB, with the majority of the data located in 4 key tables which contain about 100M rows each; they're subject to frequent updates and inserts and have many indexes.
    When configuring the logical disks, do you recommend having multiple logical drives within one array or a few arrays with one logical drive?

    Thanks in advance for your suggestions

    Andrea
  18. derrickleggett New Member

    Neither. I would recommend having dedicated physical drives for each LUN (each will be presented to the host as a mount point or drive letter). Separate the log file from the data file. If tempdb gets hit a lot, do the same with it on separate drives. If this is a really high-IOPS system, then use as many drives as you can.

    If you have to go with one of the scenarios you described, I would have multiple logical drives within one array, since it's RAID 10. That would probably not be my recommendation if it were mixed or RAID 5. What kind of SAN is this?

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  19. agw New Member

    Ok, let's go in order...
    The SAN is the HP 1000, which will have 14 drives in it. First I need to configure the array(s), then the logical drive(s), and then place the various DB files on the disks.
    At which point are the LUNs defined?
    The server is really high I/O, with a lot of writes due to continuous updates and inserts.

    Thanks Derrick

    Andrea
  20. derrickleggett New Member

    What size and speed of drives? What sizes do you need for the log and data files? How big does tempdb need to be? What size and speed of drives are on the server itself? I'm assuming this post is related to the other thread you have going? Also, do you mean the HP MSA1000?


    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  21. agw New Member

    The drives are 143 GB each, 15K U320. The log hardly ever exceeds 20 GB (in FULL recovery mode) but typically is about 1 to 2 GB. Currently I have 14 data files of anywhere between 4 GB and 20 GB each for a total of 200 GB, but on the new server the total size will be in excess of 500 GB. I'm not sure how big tempdb NEEDS to be; right now it is about 2 GB and it is on a separate disk array (physical, controller...). It's actually on the C drive with the OS (bad, I guess), which is in RAID 1. I'd like to put it in the right place on the new server, though.
    The new SAN is indeed the MSA1000 and yes the two threads are related.
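
    For reference, relocating tempdb on the new server is done by pointing its files at the new location and restarting the SQL Server service (a sketch; the target paths are illustrative, and the logical names tempdev/templog are the SQL Server defaults):

    ```sql
    -- New file locations take effect the next time SQL Server starts.
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\MSSQL\Data\tempdb.mdf')
    ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\MSSQL\Log\templog.ldf')
    ```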

    Andrea
  22. derrickleggett New Member

    Ok. You might be able to get by with putting tempdb on the local server. You'll get four drives with this server, I believe. I would definitely monitor it, though, to make sure you're not getting concurrency issues or queue lengths on the server. In regards to setting up the MSA1000: if you only have one shelf and need one of the disks for a hot spare, you're probably going to have to just make one big RAID 10 config. If you don't, you're going to have 4 + 9 in two RAID 10 configs. That's only going to give you 572 GB of usable space for the data drives. I'm not sure from what you're saying whether that will actually be enough. If so, I would recommend the 4 + 9 in two different configs.

    A note on the MSA1000: the channels only run halfway across, so IF you go with two arrays, split each one of them so you have half the disks on each side. Also, how old is this unit? Depending on age, you might want to talk with HP and make sure it's active/active. The old models are not.

    Make sure you run multiple paths to this server also.

    When you say you had multiple data files, was it multiple database files for one database, or do you actually have more than one user database?

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  23. agw New Member

    All good advice, thanks.
    I did not think of putting tempdb on the local server, but I guess it would help with performance. Are you suggesting setting up two RAID 1 arrays on the local server, one for the OS and one for tempdb?

    The MSA is going to be only 1 shelf, but according to my calculations, with 14 drives and two spares I should get 6 x 143 = 858 GB in RAID 10, isn't that so?

    This will be a brand new unit so I should have no issue with what you mention.

    Good on the multiple path, see my other thread...

    Currently I have multiple files (14) for the same database. They're all primary and vary in size. I also have some other databases but they're small and should be totally harmless.

    Andrea
  24. derrickleggett New Member

    I would set up two RAID 1 arrays on the server drives like you mentioned. Use 15K drives if you can. Make sure you have a good RAID controller for the drives.

    You will get the 858 GB. I like to split the arrays up so you have the log on one and the data on the other. That would be cutting your space awfully close, though.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  25. agw New Member

    Very good advice on the use of internal drives; frankly, before this chat I would have put everything on the SAN. I'll make sure I get the best possible drives and controller for the two local arrays and use one for tempdb and one for the OS.

    On the SAN, I could split the log and the data by creating two arrays, but as you said, my available disk space would be too small for my DB size. Also, no matter what, at the end of the day all data would need to go through the same path (HBA > Switch > Controller). The only benefit would be no contention for the actual drives, but as a drawback one would have half the spindles to work with per array. I guess, ideally, one would use another shelf just for the log file. Actually, I have an old MSA1000 which we never used; however, everything on it is 1Gb bandwidth (HBA and controller). Maybe I'll get a few smaller drives and use the second shelf for log files and drive backups. What do you think?

    Andrea
  26. derrickleggett New Member

    That sounds like a good plan if you really want some throughput on this thing....especially if you can do it inexpensively.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  27. agw New Member

    I think I'll seriously consider this solution although I'm now waiting to see what Joe says about SAS...
    What do you think?
    Andrea
  28. derrickleggett New Member

    :) I replied. I don't think about SAS. You can read my reply over there.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  29. agw New Member

    While we wait for Joe to shed some light on his thoughts: I forgot to tell you that today I collected some data off the profiler, and surprisingly enough it is my READ queue length which goes as high as 472, with an average of about 250. The write queue length seems just fine at a 0 average, with some occasional trips into the 70s.

    Where do I find the other counters you mentioned? I couldn't find them under Cache.

    Andrea
  30. derrickleggett New Member

    Look under the SQL Server counters: SQL Server:Buffer Manager and SQL Server:Cache Manager.
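
    These counters can also be read straight from sysperfinfo in SQL Server 2000, dividing the main counter by its base counter (a sketch):

    ```sql
    -- Buffer cache hit ratio as a percentage, computed from the raw
    -- sysperfinfo values (SQL Server 2000).
    SELECT v.cntr_value * 100.0 / NULLIF(b.cntr_value, 0) AS buffer_cache_hit_pct
    FROM master.dbo.sysperfinfo v
    JOIN master.dbo.sysperfinfo b
      ON b.[object_name] = v.[object_name]
     AND b.counter_name = 'Buffer cache hit ratio base'
    WHERE v.counter_name = 'Buffer cache hit ratio'
    ```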

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  31. agw New Member

    Thanks much, Derrick. Both counters (buffer and regular cache hit ratio) are flat at a 99.98 average while the average queue length is high. What does that mean?

    Also, I noticed that occasionally both read and write queue lengths go up high almost equally; could that be because both my log file and my data file are on the same array?

    Andrea
  32. derrickleggett New Member

    You're probably just soaking up your IO capacity on those disks if your total wait time for both read and write is going up like that. That really is one of the reasons people should have more disks, and dedicated, separate disks for log files and data files if possible. Remember, the log file has extremely different usage patterns and requirements than the data file.

    Try to get that other shelf if you can for this. It's always going to help if you have more spindles to throw at it.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  33. agw New Member

    Thanks Derrick,

    On the current storage I have 8 drives: 2 spares and 3 on each side of a RAID 10, so as you can see, not many spindles...
    Since I have 6 empty slots, I'm thinking, for the sake of simplicity, of adding 6 more drives (3 and 3 in RAID 10) and moving my log file there.
    This way I'd have tempdb on one card and one array (RAID 1, shared with the OS), the log file on a dedicated array, and the data files on a dedicated array, with the last two sharing the controller card, which is dual-channel anyway.
    I think, given the current situation, this would alleviate some of my IO contention. The next step would be increasing the memory to 32 GB, which would let any of my tables (the largest being 29 GB right now) fit in memory.
    Andrea
  34. derrickleggett New Member

    That would be a cheap, immediate solution and probably help you better analyze the solution. It's amazing how cheap disk drives are now IF you already have the capacity to place them.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  35. dtipton New Member

    quote:Originally posted by agw

    Currently I have multiple files (14) for the same database. They're all primary and vary in size. I also have some other databases but they're small and should be totally harmless.

    Andrea

    What is the advantage of having multiple database files for a single database, especially if they are all going to be on the same set of physical disks?

    Thanks
    don
  36. agw New Member

    Well, Don,

    I thought it would be easier to get the physical files defragmented this way, as with a total of 250 GB of disk space and a DB approaching 200 GB it would be impossible to defragment a single file.
    However, I now have many doubts. One is: what is the point of having the physical files defragmented when in fact the physical file shown to the OS is the result of what the RAID controller puts together?
    More so, if, theoretically, the files were indeed contiguous, what would be the benefit of having multiple spindles? I think maybe one would get better performance out of a fragmented drive instead.

    Can anyone share his view?
    Thanks
    Andrea
  37. satya Moderator

    If the filegroup comprises multiple files spread across various physical disks, each with its own disk controller, then queries for data from the table will be spread across the disks, thereby improving performance. The same effect can be accomplished by creating a single file on a RAID (redundant array of independent disks) level 0, 1, or 5 device.

    It is advantageous to get as much data spread across as many physical drives as possible in order to improve throughput through parallel data access. To spread data evenly across all disks, you can place a single file across striped disks or maintain each disk separately and place a file on each disk.
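
    A sketch of the second approach, one file per disk within a single filegroup (database name, filegroup name, paths, and sizes are all illustrative):

    ```sql
    -- Spread one filegroup over two physical drives by giving it a file
    -- on each; SQL Server fills the files proportionally, so objects
    -- placed on the filegroup end up striped across both drives.
    ALTER DATABASE Sales ADD FILEGROUP SpreadFG
    ALTER DATABASE Sales ADD FILE
        (NAME = Sales_Spread1, FILENAME = 'F:\MSSQL\Data\Sales_Spread1.ndf', SIZE = 10240MB)
    TO FILEGROUP SpreadFG
    ALTER DATABASE Sales ADD FILE
        (NAME = Sales_Spread2, FILENAME = 'G:\MSSQL\Data\Sales_Spread2.ndf', SIZE = 10240MB)
    TO FILEGROUP SpreadFG
    ```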




    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  38. agw New Member

    Hello Satya,

    The filegroup does comprise multiple files, and I'd assume they are spread across all disks, as the disks are attached to a RAID controller in a RAID 10 configuration. So in this configuration each single file is spread across multiple spindles. The original question still remains, though: is it advantageous to have the physical files defragmented at the OS level when they are in fact striped across a number of physical disks (as with RAID 10)?
  39. satya Moderator

    It is not ideal to keep physical files fragmented; in this case you need downtime to defragment the operating system files.

    Unless SQL Server is running on a machine with enough RAM to hold the entire database, I/O performance will be determined by how fast reads and writes of SQL Server data can be processed by the disk I/O subsystem. It is best to concentrate on disk queuing instead of on the actual I/O for each disk. Disk I/O speeds depend on the transfer rate capability of the drives, which cannot be adjusted.


    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  40. agw New Member

    So, if I understand correctly, it is ideal to have the data files of a database properly defragmented at the OS level regardless of whether they are on a single disk or an array of disks controlled by a RAID controller.
    That said, the larger the files, the more difficult and time-consuming it is to defragment them, so the idea of having multiple data files is not a bad one. However, when the database is spread across multiple files, so are the indexes (unless properly dedicated to a filegroup using one file only). When indexes are spread across more than one file, as I understand it, logical fragmentation will occur, so it seems to me that in order to avoid this, data files large enough to contain specific indexes, or all indexes, should be created.
    Lastly, according to Microsoft's testing, physical file fragmentation affects performance less on large systems with a more robust disk subsystem.
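
    Logical fragmentation of the indexes themselves can be measured with DBCC SHOWCONTIG in SQL Server 2000 (a sketch; the table name is illustrative):

    ```sql
    -- Report scan density and logical fragmentation for every index on
    -- one table; FAST skips the page-density scan for a quicker check.
    DBCC SHOWCONTIG ('Orders') WITH FAST, ALL_INDEXES
    ```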
  41. bertcord New Member

    I would not worry about physical fragmentation too much unless you have large amounts of sequential access. For my DBs I have mostly 8 KB reads and writes; it doesn't matter much whether the data is fragmented or not.

    Bert
  42. agw New Member

    That makes sense...I'll check that.
