SQL Server Performance

Dell MD3000+MD1000 or MSAXX or roll your own?

Discussion in 'SQL Server 2005 Performance Tuning for Hardware' started by rtowne, Jun 19, 2007.

  1. rtowne New Member

    I have a SQL 2005 database that is currently at 3 terabytes. At current growth patterns it is expected to reach 10 terabytes within the next 14 months.

    I am currently migrating to a more powerful server and choosing between Dell and HP.
    * The HP choice is a DL585 G2, 4 x 2.8GHz (unless someone can convince me the 3.0GHz is worth the extra $), 32 GB RAM, x64
    * The Dell choice is a PowerEdge 6950 with the same RAM and CPU speed/count.

    The servers seem very comparable, and the additional memory slots on the HP (32 vs 16) make it easier to ramp up on the cheaper 2GB sticks, but with the size of my database -- it comes down to storage performance.

    The MSA70 and MSA50 both support 2.5" SFF SAS drives; the biggest currently available at 15k is 72GB. At RAID 1+0 for most of my data, 200 drives = ~7 TB of space, which would give me > 50% capacity to grow (assuming I wouldn't want to go over 80% drive capacity utilization).
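
    Just to sanity check my math (a rough sketch only, assuming 72GB usable per drive and RAID 1+0 halving raw capacity):

        -- back-of-envelope capacity check (assumptions: 72GB per drive, RAID 1+0 = half of raw)
        SELECT
            200 / 2 * 72 / 1024.0          AS usable_tb_raid10,      -- ~7.0 TB usable from 200 x 72GB drives
            200 / 2 * 72 * 0.80 / 1024.0   AS usable_tb_at_80_pct;   -- ~5.6 TB if staying under 80% utilization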

    Staying with HP, the other options are going with 2.5 SFF 10k drives to move up to 146GB or 300GB -- or go to the MSA60 3.5 LFF.

    Each option allows cascading, but I'd like to hear what others think about the wide SAS ports and whether they think cascading means overloading the link, or whether the pipe can handle it.

    From the reading I have seen so far, most agree that you shouldn't cascade but just add another controller (P800 since it has so much cache?) to distribute the load. (the P800 is PCI Express and you only have 3 x8 PCI Express and 4 x4 PCI Express slots)


    The Dell is the second option; for storage you have the MD3000 and can attach up to 2 MD1000s for scaling. The Dell options are only 3.5" LFF drives currently. The array controllers are similar to HP's AFAIK.

    Another option is to choose the best server and go with a SAS fabric of SAS controllers/expanders, essentially rolling my own with lesser-known names for the enclosures, finding the fastest disks, etc., and growing my internal SAS fabric as the pattern for expansion. This carries a good bit of risk since most combinations in that architecture aren't well known and the community is much smaller. I know all the parts are just OEM'd, and LSI and similar internals can be found in Dell/HP and everyone else -- but solid combinations are hard to find so far. I really like the 'database brick' mentality shown in the SQL Mag articles and the IO they got compared to a SAN for the B&N BI project/prototype -- and I am thinking the faster 15k SFF SAS drives would have blown their SATA drive prototype out of the water -- but for now I don't have the knowledge to assemble such a beast or a community to help me in case of problems.

    So, I know this is a huge post, but if you had to choose between the 3 -- how would you go? I am assuming SFF 15k 146GB drives are probably right around the corner, and that might help reduce the # of enclosures and still keep the data on a good # of spindles.


    Thanks for all your comments!

    Robert Towne


  2. rtowne New Member

    Additionally, I forgot to mention/ask about folks using the Dell MD3000 for SQL Server databases. Has anyone used the "Snapshot" or "Virtual Disk Copy" software features that are options for this model? I was wondering whether it is possible to snapshot a live SQL database with this (probably via VSS), and how it works with multiple MD3000s.
  3. joechang New Member

    right this very day/month is not a good time to buy such a large system
    AMD should have quad core Barcelona in the next 3 months
    the Intel Tigerton/Clarksboro is right around the corner

    also, current HP/Dell SAS controllers are built around the older 8033x controllers or equivalent
    the new 8134x are right around the corner

    bug both HP and Dell about the next gen 4 socket platforms and SAS controllers
    depending on exactly what you are doing
    consider that the Intel and AMD x86/x64 platforms can only drive 3-4GB/sec disk IO
    the HP Integrity rx6600 with the sx2000 chipset can do 18GB/sec

    see my posts on hardware and storage

    it's not that you cannot cascade external storage units,
    but first fill each PCI-E slot with a controller,
    don't worry if it's x8 or x4, the controller itself cannot drive more than 1GB/sec, so who cares

    build your own is really more suitable for solution providers needing a canned package, for which they invest the time to find one combination that works well together for multiple uses, ie, sites
    for a one time thing, stick with HP/Dell, then complain to them if you have problems


  4. markmin New Member

    I was planning to buy a HP MSA 70 and fill it with 15k 36GB drives to speed up my .mdf partition. I have the P800 controller. Will the MSA 70 work with the new 8134x controller? How long should I plan on waiting for the new controller and storage units? I'm assuming the 2.5 SAS drives will be usable, right? Thanks!
  5. joechang New Member

    storage controllers are cheap
    buy the MSA70 now with the P800

    buy a new controller later,
    of course switching controllers means deleting the entire partition,
    so this is a full backup and restore

    actually, for a single unit
    i think you might get 2 or 3 MSA50 with 10 SFF drives each,
    each MSA50 on its own controller

    instead of all 25 drives in the MSA70 on 1 controller
    i know it's more expensive, but the MSA70 only makes sense when you fill all the PCI-E slots and still need more drives
  6. markmin New Member

    I'd buy two MSA50s, but I'm not clear on how to create one partition across two controllers. Could you point me to a resource for setting this up? Thanks!
  7. markmin New Member

    I know the p800 has two channels. In the HP array config utility, is it possible to create a single RAID using drives attached to both channels?
  8. joechang New Member

    i do not think the HP controllers let you span controllers

    in any case, do not bother

    1 controller connects to 1 MSA
    make 1 or 2 arrays on each MSA

    create multiple data files for the big db, 1 data file on each controller, except the controller that handles the logs and internal disks
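
    a minimal sketch of that layout (drive letters, file names and sizes here are hypothetical, one array per controller, logs on their own controller):

        -- one data file per controller/array; names, paths and sizes are hypothetical
        CREATE DATABASE BigDB
        ON PRIMARY
            (NAME = BigDB_sys,   FILENAME = 'C:\SQLData\BigDB_sys.mdf',   SIZE = 1GB),
        FILEGROUP Data
            (NAME = BigDB_data1, FILENAME = 'E:\SQLData\BigDB_data1.ndf', SIZE = 500GB),  -- array on controller 1
            (NAME = BigDB_data2, FILENAME = 'F:\SQLData\BigDB_data2.ndf', SIZE = 500GB),  -- array on controller 2
            (NAME = BigDB_data3, FILENAME = 'G:\SQLData\BigDB_data3.ndf', SIZE = 500GB)   -- array on controller 3
        LOG ON
            (NAME = BigDB_log,   FILENAME = 'L:\SQLLogs\BigDB_log.ldf',   SIZE = 100GB);  -- logs on their own controller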

  9. rtowne New Member

    <1 controller connects to 1 MSA / make 1 or 2 arrays on each MSA>

    Is there no advantage to having 1 volume across more spindles (with MSA70s, would that be 50 disks?)? I am in favor of the DBA knowing how to strategically get the most I/O rather than creating 1 big blob -- but I'm not sure I understand the reason why this is best.

    So the P800 features state the following: "HP's highest performing 16-port PCI-Express controller with expander support" ... what is this "expander support"? I think of SAS expanders, say from Vitesse or someone -- is this what they mean?

    A single P800 is stated to support up to 100 drives externally - this is 4 MSA70s. It has 16 SAS physical links across 2 internal x4 wide and 2 external x4 wide connector ports. What are the points of failure in having 4 enclosures attached to 1 P800?

    Is there any reason why I couldn't have 7 P800s in a DL585 (which has 7 PCI-Express slots - some x4, some x8)?

    (I know lotsa questions here, but I am really digging this thread)

    Thanks!
    Robert
  10. joechang New Member

    the important point #1 is distributing data across as many disks as you can,
    whether its 1 big file across 2*n disks
    or 2 files, each file on n disks
    the data is still over 2*n disks

    point #2 is to spread the data across a reasonable number of controllers
    10-15 per controller is a good starting point for current generation SAS controllers - see other posts on this

    now think about this,
    you want to add more disks, ie, another MSA 50

    do you want to backup the db,
    drop the database,
    delete the existing partition,
    create a new partition
    restore the original db

    or do you just want to add a controller
    add a data file?
    SQL will automatically start filling the new file
    a reindex will speed this up
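
    a sketch of that expansion path (database, file, path and table names are hypothetical):

        -- add a data file on the array behind the new controller (names and paths are hypothetical)
        ALTER DATABASE BigDB
        ADD FILE (NAME = BigDB_data4, FILENAME = 'H:\SQLData\BigDB_data4.ndf', SIZE = 500GB)
        TO FILEGROUP Data;

        -- rebuilding the big indexes redistributes pages across all files in the filegroup
        ALTER INDEX ALL ON dbo.BigTable REBUILD;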

    HP is normally a good vendor, with a good selection of parts
    but notice they have no document with meaningful performance info?

    i suspect the P800, like other current generation SAS controllers (prior to the Intel 8134x)
    cannot get more than 800-900MB/sec per controller
    so there really isn't much of a point in featuring 4 x4 SAS ports
    it can only fully drive 1 x4 port, but having 2 x4 is better than 1

    yes that's the idea, fill your PCI-E slots with SAS controllers,
    1 MSA to each controller
    do this before you start sticking too many disks per controller
  11. rtowne New Member

    one thing i noticed after pricing the dl585 against the poweredge 6950 and looking closer at components was the redundancy paths in the DAS storage. it is my understanding that the dl585 can connect up to 8 enclosures - but only 1 connection can go to the first msa60 - then it is daisy chained with a single cable to the next msa60, which has an out to the next msa60, which has an out to the next msa60. even if you wanted, you couldn't have dual/redundant paths to each msa60. so, if you have a p800, the first enclosures can be connected directly without any daisy chaining - at least then your only point of failure is a single enclosure or (gasp) the p800. i thought you would be able to use 2 p800's and provide controller redundancy or even mpio, but that isn't the case.

    the dell md3000 uses a perc 5/e which has 2 ports out - which both could go to 1 md3000 with 2 raid controller modules. each md3000 can have 2 connections down to the next md1000, and so on (up to 45 drives w/3 enclosures). so far everything is redundant but the perc 5/e -- but you could connect 2 of those into your server and spread 1 cable from each perc 5/e to a different md3000 controller - having redundant paths from server to disk (assuming disks are raided somehow :))

    http://support.dell.com/support/edocs/systems/md3000/en/IG/HTML/hardware.htm

    i am not sure how the 2 paths to the md3000 controller are redundantly managed - i was told mpio was possible if using 2 perc 5/e's to 1 md3000 with 2 array controllers.

    i assume the main difference between the md3000 vs msaxx is that the dell uses hba's to connect to the md3000, which houses the array controllers -- the msa array controllers are back in the server and not down in the msaxx. this gives the md3000 much more fault tolerance in my book, if i understand correctly (as long as you buy the dual modules over single).

    if anybody has scaled out the md3000 with redundant controllers, let me know if you use microsoft mpio or something else to allow active/active connectivity - or tell me if i misunderstood something..

    thx

    robert towne
  12. joechang New Member

    it is my impression that too many people try to "buy" fault tolerance, ie, no single point of failure

    I believe reliability is an operational skill
    that is, you practice the recovery from any point of failure

    i have seen too many people buy what they believed was a S-P-F resistant config, encounter a failure,
    the system is still up and running
    then because they never practiced this,
    crash the system while trying to replace the failed part

    so i still believe operational skill is more important than trying to do mpio
  13. rtowne New Member


    I believe operational skill is also much more important than trying to do mpio; however, I don't think that argument justifies not making your system more resilient.

    something like mpio just means I can take a system down on my own terms and replace a part (if it isn't hot swappable) rather than doing it at a forced time which might just happen to be in the middle of the day. on an order entry system where an hour could mean tens of thousands of dollars or more in lost revenue -- if you didn't create a resilient system (via replication, or fault tolerant h/w parts) -- then you probably won't have a job much longer to practice your operational skills.

    learning how to restore quickly is a great operational skill, but in a database with multiple terabytes any action is going to take some time. maybe mpio would allow a failing controller to be taken over rather than creating some corruption in the database and save me from having to prove my operational skills.

    If you don't try to create a situation that reduces the likelihood of failure then maybe you are in an environment where it is ok to be down for extended periods, and that is very justifiable versus spending more money to add resilience. just a viewpoint.
  14. joechang New Member

    in that case, i suggest you use the MD3000 in a cluster
    not mpio to a single server

    my own impression of MS is as follows
    when all the core developers work on a particular hw to develop the code,
    everything works reasonably well
    when the hardware is considered exotic,
    the code may work, but behaves funny,
    because the coders do not live and breathe the stuff

    on many of the SANs with multi-path io, disks would appear twice,
    technically the system was supposed to have fault resilience, with hot swap replacement of failed parts
    but for everything to actually work right,
    an OS reboot was required,

    so much time was spent figuring out how the system behaved, any justification for the hot-swap capability was just total bs

    i suggest you do a hot/warm standby system one way or another
    the mpio may seem valuable in theory,
    but you will spend so much time on it, all your arguments for it ...
  15. space_boy79 New Member

    I too am considering the pros & cons of an MSA70 vs MD3000.

    Am I missing something? I would expect that with 25 SAS drives connected to only one x4 SAS link you would saturate the x4 link before you ever pushed the drives beyond their capability... surely 2 MD3000s would be much better, offering dual x4 links, redundancy, and of course with only 15 drives per chassis you would experience less contention across a single x4 link compared to 25?
  16. joechang New Member

    if you do pure sequential, you can get 80-120MB/sec per disk depending on the disk, the newest Seagate 15K.5 doing the 120

    The 10 internal disks on my PE2900, all Fujitsu 15K, can do 795MB/sec on a table scan
    My MD1000 with 15 disks can only do 780MB/sec,
    i suspect it is because there is a mix of Seagate and Maxtor drives with slightly mismatched characteristics
    (HP and Dell do not let you specify the disks, but I would bug them to ensure all are matched, vendor, model and possibly even firmware)

    800MB/sec is probably also the limit for x4 PCI-E and x4 SAS

    now it is very hard to do pure sequential in SQL operations, unless you go to extraordinary measures to prevent fragmentation (ie, a dedicated filegroup for just the data of the largest table or two)
    realistically, you might be doing 64K somewhat-sequential IO, meaning you might get 20-30MB/sec per disk
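
    putting rough illustrative numbers on that, for 15 disks behind one controller (per-disk rates as above):

        -- rough throughput arithmetic, assuming the per-disk rates quoted above
        SELECT
            15 * 80   AS seq_mb_sec_15_disks_low,     -- 1200 MB/sec raw from 15 disks at 80 MB/sec each
            15 * 120  AS seq_mb_sec_15_disks_high,    -- 1800 MB/sec at 120 MB/sec each
            800       AS controller_x4_limit_mb_sec,  -- but one controller / x4 link tops out near 800 MB/sec
            15 * 25   AS realistic_64k_mb_sec;        -- ~375 MB/sec at 20-30 MB/sec per disk for 64K mostly-sequential IO

    so on pure sequential, 15 disks can already outrun one controller, which is why adding controllers comes before piling more disks onto one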

    if i had less than 60-90 disks, I would first want to have 1 SAS controller in a x4 PCI-E slot for each rack of 10-15 drives, preferring the HP ML370 with 6 PCI-E x4 slots
    over the Dell PE2900 with 4 PCI-E slots (2x8, 2x4?)
    when i have 120+ disks, it's no big deal having 1 MD3000 chained to 1 MD1000 per SAS controller
    or 1 MSA70 per SAS controller
    but for the 40-60 disk range, I suggest the MSA50 or MSA60

  17. space_boy79 New Member

    I guess the problem I have with the MSA50/60 is the lack of drives for RAID 10 with a hot spare. Using an MSA60 (12 drives) I'd only get 5 pairs + 2 hot spares (or 6 pairs & no hot spare), whereas using the MD3000 (15 drives) I'd get 7 pairs + 1 hot spare.

    I know you can daisy chain the MSA's but I'd essentially be putting all my eggs in one basket doing so, one failure on the MSA or P800 would bring down a whole heap of drives.

    We are hosting multiple medium-transaction-volume HRIS databases on SQL 2005. The application handles its own recovery, therefore they are only placed in simple recovery model.

    The configuration i'm currently looking at are;

    A Dell 6950 with 2 MD3000s in RAID 10 (one for data, one for logs). Due to the DBs being in simple recovery model, the data array would have 15 x 300GB and the log array 15 x 150GB. Each MD3000 would have its own HBA in the 6950, and I would ultimately look at adding a second 6950 down the track for two active/active nodes, with one node able to handle all processing in a failure (obviously at reduced capacity).

    Am open to any other suggestions however would prefer HP or DELL.
  18. joechang New Member

    unless you intend to get more than 4 MD3000s,
    i suggest the PE2900 with two Quad Core
    over a 6950 with four Dual Core
    unless you are expecting the time frame to be Sep, when quad core in 4 socket systems may become available
    the price delta may only be $6-8K HW,
    but if you are on per proc licensing, the SW cost is substantial

  19. space_boy79 New Member

    I hadn't considered the PE2900 as I don't think 2 procs are going to cut it.

    Ideally what we're after is the fastest I/O subsystem possible without going to the expense of a SAN with FC etc.

    Does anyone know if the MD3000 actually has a total bandwidth of x8 when connected to two separate HBAs in the same server? Or is the second controller purely for redundancy?
  20. joechang New Member

    the fastest IO possible is to avoid a SAN
    a SAN is a computer system that serves up partitions
    like the way a file server provides access to files

    with direct IO, the path is host to HBA to disks
    with SAN, it is host to host HBA to SAN HBA (host side) to SAN processor to SAN HBA (storage side) to disk

    a SAN provides additional capabilities over direct attach
    but it adds layers
    how much better would you perform if you had additional layers of management before reaching the decision maker?

    most of the IOP33x controllers can only drive a single x4 channel
    i did not see any bandwidth improvement on the MD1000 when in split mode

  21. space_boy79 New Member

    OK, since DAS is the way to go and non-sequential IO is only giving you 20-30MB/sec per drive, then it would be safe to assume that the fastest chassis would be the MSA70?

    Downside being that a fully loaded MSA70 with 25 SFF 72GB 15K drives in RAID 10 only gives 864GB, vs the MD3000 with 15 LFF 300GB 15K drives, which gives a lovely 2.1TB.

    Also, I'd assume that MSA70 having more spindles in use would then be able to handle a higher queue depth better than the MD3000?
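
    The usable capacity above works out roughly like this (my own quick arithmetic, counting whole mirror pairs only and no hot spares):

        -- usable capacity comparison (assumption: RAID 10 = whole mirror pairs, odd drive left over, no spares)
        SELECT
            (25 / 2) * 72            AS msa70_raid10_gb,   -- 12 pairs of 72GB = 864 GB (1 drive left over)
            (14 / 2) * 300 / 1024.0  AS md3000_raid10_tb;  -- 7 pairs of 300GB ~= 2.1 TB (1 drive as spare)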
  22. joechang New Member

    the fastest storage is multiple controllers, each connected to one storage enclosure,

    if you have a PE2900
    put in 4 PERC5/E controllers and 4 MD 1000's
    or 4 SAS HBA's and 4 MD3000's

    ie, 15 73GB 15K drives in each MDx000

    or for HP

    6 P400 controllers
    6 MSA50 each with 10 36G 15K drives
    or 6 MSA60 each with 12 73GB 15K drives

    DO NOT get a single storage unit filled with 300G drives
    this is the mistake so many people make
  23. markmin New Member

    I was planning to attach 2 MSA 50s to the two external ports on the P800 controller. Would performance be measurably better with a second P800, with each controller supporting just one MSA 50?
  24. joechang New Member

    there would be no difference in random IO,
    only sequential IO

    it is seriously annoying that HP does not have a plain PCI-E SAS controller with 2 external x4 SAS connectors
    the P800 is somewhat over-priced because it has 4 x4 ports,
    but only 2 are external, which i think is silly

    anyways, i have seen no indication that this generation of SAS controllers can drive more than 800MB/sec,
    so 2 P800s should drive more,
    the issue is that the MSA50 only holds 10 drives,
    which is just borderline to warrant one controller per MSA,
    especially the expensive P800



  25. clm100 New Member

    Just wanted to jump in and ask a tangential question. Hope nobody minds. I've found this board to be a great resource so far.

    We're quickly running out of space on our primary database server. This server houses many (71 right now, and growing) databases ranging from 100MB to 150GB. It's used for analytical work almost entirely. This consists of a couple (literally 4 at the absolute most, usually just one) DBAs/Analysts doing complex queries and large updates to big tables. The only transactional access is when a few external processes go through and, for example, clean up or calculate a specific field on a specific table. At any given time most of the databases are idle and only one or two are getting used.

    It's a 4x Opteron 270 (that's 8 cores at 2.0GHz), 16GB of RAM, Windows 2003, SQL 2000 (both patched current). In terms of storage we currently have:

    * 3ware 8000 with 2x SATA 250GB drives in RAID1 for c:
    * 2x Adaptec 2230SLP connected to 2x Dell PowerVault 220S
    * The first powervault contains 9x Fujitsu 73GB 10k SCSI drives in a RAID5. Formatted this comes to 546GB. This is used for a second tier of online databases that are used only occasionally but need to be live and reasonably fast.
    * The second powervault contains 14x 146GB Maxtor Atlas 15k II. This consists of a 2 drive RAID1 for logs (we use simple logging for everything, no need for shipping or anything else) and a 12 drive RAID10 (formatted 820GB) for our main databases.

    We're now looking to expand and get something in between. A large RAID10 that's fast (but not the fastest) and doesn't cost a lot of money. Our current thought is the Adaptec 3085 controller (we have 2 PCIe slots free, but it has to be low profile) connected to a Dell MD1000. That would have something like 10-14x 300GB 10k SAS drives, formatted RAID10. This would provide plenty of storage for the majority of our databases, and current priority databases would be moved on to the large 15k array.

    Does anyone have any thoughts on that specific setup? Would we be better off with another controller (such as LSI?) or enclosure? We're trying to do this cheap, and are looking at the Dell Outlet for refurbs (the 9x 73GB 220S PowerVault was nearly free from the Dell Outlet).
  26. joechang New Member

    my experience is there is value in sticking with single vendor components within a system,
    of course i will buy memory from Crucial if there is a significant delta

    i will also buy bare disk drives, and the drive mounts

    but mixing too many different controllers will eventually cause headaches
    if you are doing analytics, it is a big win to get to 64-bit, OS & SQL Server, ie SQL 2005
    there is probably not much of an issue adding SAS drives to the current SCSI setup,
    but stick with the PERC5/E if you go with the MD1000
  27. clm100 New Member

    Why stick with the Perc5/e? We had some major issues with the PercEs. They don't really implement RAID10. They do Raid1 + Spanning.

    EDIT: We have SQL 2005 on some other servers. While it does have some nice features we all universally find SQL 2000 significantly easier to use and DTS much easier. As a result we're not really eager or interested in jumping to 2005. Perhaps when 2008 comes out we'll move to that.

    EDITED again: There's some more info about RAID10 on Perc 4 controllers here.

    Page1:http://support.dell.com/support/edocs/software/smarrman/marb34/en/ch3_stor.htm#1037043

    Page 2:http://support.dell.com/support/edocs/software/smarrman/marb34/en/ch8_perc.htm?c=us&l=en&cs=04&s=bsd
  28. joechang New Member

    RAID 1 + Span is RAID 10 = 1 + 0 = (mirror, then strip)

    Dell had serious issues with PERC 2 or 3, several years ago
    they had one model, but it could be either an Adaptec or LSI
    i think the Adaptec had the problems

    the PERC4/5 use the Intel IOP and LSI chips, both should be OK
  29. clm100 New Member

    joe, Span != Stripe.

    Perhaps I'm misunderstanding you, but the issue with the Perc 4 (and it was a confirmed issue, read those documents I linked) is that it implemented RAID 1 with concatenation. This gave you the appearance of RAID10 (it was redundant and the size of N/2) but it wasn't. The Perc4 implemented RAID10 the way disk spanning is RAID0.
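
    To illustrate the distinction, here is a purely hypothetical mapping (not any controller's actual algorithm) of which mirror pair a logical 64KB block lands on, assuming 6 pairs and roughly 1,000,000 blocks per pair:

        -- illustrative only: RAID 10 stripes consecutive blocks across pairs, concatenation fills a pair first
        SELECT n AS logical_64kb_block,
               n % 6       AS raid10_pair,   -- round-robin: neighbouring blocks land on different spindle pairs
               n / 1000000 AS concat_pair    -- concatenation: long sequential runs stay on a single pair
        FROM (SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
              UNION ALL SELECT 500000 UNION ALL SELECT 1500000 UNION ALL SELECT 2500000) AS blocks;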
  30. joechang New Member

    my mistake, stripe
    i tested the performance characteristics of the Intel SCSI RAID controller, which is the same as the Dell PERC4, same drivers,

    you are probably misinterpreting whatever was written
    i suggest you test the performance characteristics,
    it behaves as RAID 10, which is all i care about
  31. clm100 New Member

    We did extensively test it. We spent several weeks going back and forth with Dell Engineers about this issue. I can dig up some old emails on this with iometer and perfmon test results, if you like.
  32. joechang New Member

    it's not worth the effort for the PERC4, this card is already obsolete

    the PERC 5/E is essentially the same as the Intel and one of the LSI PCI-e SAS RAID controllers
