SQL Server Performance

Can't find great article by Joe Chang

Discussion in 'SQL Server 2005 Performance Tuning for Hardware' started by danielreber, May 22, 2007.

  1. danielreber New Member

    I am trying to find an article/post from Joe Chang that I read that goes over the merits of a SAN vs. local disk enclosures. It discussed how the bytes/second would be lower using SANs, among other points. Can someone post the link, along with any other links that would help me decide between a SAN and local disk enclosures for a multi-terabyte data warehouse?

    Regards,



    Dan
  2. joechang New Member

    i do not recall such an article
    most of my papers are on
    http://www.sql-server-performance.com/joe_chang.asp

    My technical complaint against SANs is that most are built around an obsolete system architecture, so a single SAN cannot handle a heavy sequential load
    the CX-500/700, for example, were built around the ServerWorks GC chipset, which could only support 1.3GB/sec disk IO

    to compound this, the SAN is grossly overpriced, so many people cannot afford an adequate number of disks, not to mention IO channels

    for a multi-TB DW, my preference is to have the equivalent sequential bandwidth of 6-12 4Gbit/sec FC ports
    that is, 2-4GB/sec
    I can get this with 6 SAS RAID controllers/HBA + 6-12 racks of disks
    $30-60K
    vs over $180K with a SAN
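
    A quick sketch of the bandwidth arithmetic behind those figures (assuming roughly 340 MB/sec usable per 4Gbit/sec FC port, in line with the per-port estimates quoted later in this thread):

    ```python
    # Rough sequential-bandwidth math behind the "2-4GB/sec" figure.
    # FC_PORT_USABLE_MB is an assumed usable rate per 4Gbit/sec FC port,
    # well below the ~500MB/sec raw line rate.
    FC_PORT_USABLE_MB = 340

    for ports in (6, 12):
        total_gb = ports * FC_PORT_USABLE_MB / 1000.0
        print(f"{ports} x 4Gbit/sec FC ports ~ {total_gb:.1f} GB/sec")
    # 6 ports -> ~2.0 GB/sec, 12 ports -> ~4.1 GB/sec
    ```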
  3. danielreber New Member

    I found the information at http://www.sql-server-performance.com/jc_system_storage_configuration.asp. Below is what I was looking for.

    Fiber Channel / SAN Storage

    Fiber channel was designed for the high-end storage market. For some reason, the FC/SAN vendors did not increase FC bandwidth from 2Gbit/sec to 4Gbit/sec in an expeditious time frame. In the meantime, U320 SCSI is being replaced by 3Gbit/sec SAS, and even the original 1.5Gbit/sec SATA is being replaced with an improved 3Gbit/sec version. One might argue that FC at 4Gbit/sec, finally becoming available after a long pause since the 2Gbit/sec launch, is now faster than SAS and SATA. However, the SAS and SATA 3Gbit/sec links are meant to connect 1-4 disk drives, while SAN systems intend to cram 14-28 disks over a single 2-4Gbit/sec link, severely constraining sequential bandwidth.
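
    The per-disk share of link bandwidth implied by that paragraph can be worked out directly (assuming, as a rough rule of thumb, ~100 MB/sec usable per Gbit/sec of link rate):

    ```python
    # Per-disk share of link bandwidth for the configurations described above.
    # The 100 MB/sec-per-Gbit/sec factor is a rough rule of thumb, not a spec value.
    def per_disk_mb(link_gbit, disks):
        return link_gbit * 100 / disks

    print(per_disk_mb(2, 28))  # 28 disks behind one 2Gbit/sec FC link: ~7 MB/sec each
    print(per_disk_mb(4, 14))  # 14 disks behind one 4Gbit/sec FC link: ~29 MB/sec each
    print(per_disk_mb(3, 4))   # 4 disks on one 3Gbit/sec SAS link:     75 MB/sec each
    print(per_disk_mb(3, 1))   # dedicated 3Gbit/sec SAS/SATA link:    300 MB/sec
    ```

    A single disk of that era could sustain far more than 7 MB/sec sequentially, so in the crammed FC configuration it is the link, not the disks, that becomes the bottleneck.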


    Thanks


    Dan
  4. JWSQL New Member

    quote:Originally posted by danielreber


    Fiber channel was designed for the high-end storage market.

    As joechang has repeatedly said, a SAN is not going to give you the absolute highest performance possible.

    But... there's more to consider than pure performance in a big production environment. A "SAN" is basically a fancy computer in front of a bunch of disks. So for an EMC DMX3000, the "DMX" is their shorthand for "Direct Matrix Architecture". And to continue unraveling the acronym, the "Direct Matrix Architecture" is fancy marketing-speak for a computer that tries to intelligently manage i/o across 500 drives. Same basic idea for the big HP StorageWorks XP frames or IBM Shark disk frames.

    So why in the world would you insert an extra "computer" in between your 500 drives and your host server?! Why not just do direct-attached disks if it's faster?! Well, what that extra "computer" buys you is flexibility:


    • redundant copies to another separate frame (synchronous or async)
    • redundant point-in-time copies over a WAN
    • redundant copies to refresh a dev/qa/training server
    • split-mirror backups
    • assignment of LUNs to several hosts
    • configure consistency snapshot groups between separate frames (helpful for complex disaster recovery)
    • self-monitoring health checks and phoning home for vendor's technician to swap out defective drives

    Again, keep in mind that all of the above happens without using the cpu of the HOST computer. And also keep in mind that the above tricks happen without installing software volume managers (e.g. Veritas) or configuring SQL Server log shipping/mirroring. Any debate about the premium prices for the above "benefits" is irrelevant to the discussion. You either want/need them or you don't.

    So calling a SAN a handicap to performance can be accurate. Accurate, yes, but incomplete in describing the big picture of all the things you might want to do with enterprise storage. Choosing SAN vs direct-attach is not just about performance; it's really about tradeoffs: price/performance/availability/support/flexibility/etc.

    Different situations require different storage architectures. For example, Google's massive 50000+ disk cluster doesn't use a SAN, and it also doesn't use joechang's direct-attach RAID recommendations. Intelligent, deliberate design trumps any SAN/RAID/whatever. Google's architects ignored all the existing/typical commercial offerings and designed something unique for their purposes.


    (btw, I realize "SAN" also means including the FC fabric switch topology but I think "SAN" in this context simply means the big disk frames.)
  5. joechang New Member

    these features are nice
    but if the middle layer is an obsolete computer system,
    so is your entire solution
    what is the point of nice features (beverage cup holders, surround sound) if the engine is severely underpowered?

    now a mid-range SAN is really just built on top of a standard computer system,
    as opposed to the high-end, which is typically a proprietary solution

    I do not know what the underlying platform of the current CLARiiON CX3 line is,
    but if it is not the Intel 5000P, the AMD platform with DDR2, or equivalent,
    then it's obsolete

    even then, the CX3 line implements 8 x 4Gbit/s FC ports to hosts, 8 to storage
    (or 4+4 per SP)
    each 4Gbit/s can probably drive 390MB/s between host and SAN cache
    or 330MB/s between host and disk
    for a probable total of 2.7GB/sec
    but how many disks will it take to get that performance in SQL table scans?
    and how is each disk amortizing support structure?
    what is the point if the SAN can deliver but at such high cost the user elects for a crippled configuration?
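
    Multiplying out those per-port figures gives the aggregate estimate (a sketch using the numbers in the post above, not measured values):

    ```python
    # Aggregate front-end throughput for 8 x 4Gbit/sec FC ports,
    # using the per-port estimates given above.
    PORTS = 8
    print(PORTS * 390 / 1000.0)  # host <-> SAN cache: ~3.1 GB/sec
    print(PORTS * 330 / 1000.0)  # host <-> disk:      ~2.6 GB/sec
    ```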

    fundamentally, a critical DB needs brute force storage performance
    when a report does a table scan, you need to be able to power through this quickly
    for a high row count loop, SQL will dump a huge load into the disk queue
    if the SAN cannot handle high-queue operation, your users will know

    8 x 4Gbit/s FC is more or less equivalent to 11 SAS ports, or 3 x4 connectors
    if i built a storage system on the 5000P chipset
    I could configure 6 x4 PCI-E slots
    each PCI-E slot could have 2 x4 SAS ports (1 SAS port = 3Gbit/s)

    now 2 x4 SAS exceeds the throughput of 1x4 PCI-E
    but most of my traffic is unidirectional,
    so each PCI-E adapter would have 1 port to host, and 1 to storage
    giving me 6x4 to host, and 6x4 to storage

    that's a lot more than 8 x 4Gbit/s FC ports
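
    The port-equivalence arithmetic above can be checked directly (1 SAS port = 3Gbit/sec and one x4 connector = 4 SAS ports, as stated in the post):

    ```python
    # Aggregate signaling rate: 8 x 4Gbit/sec FC vs the SAS equivalents.
    fc_total_gbit = 8 * 4            # 32 Gbit/sec over FC
    sas_ports = fc_total_gbit / 3    # ~10.7 -> roughly 11 SAS ports
    x4_connectors = sas_ports / 4    # ~2.7  -> roughly 3 x4 connectors

    # the proposed 5000P build: 6 x4 connectors to host, 6 x4 to storage
    host_side_gbit = 6 * 4 * 3       # 72 Gbit/sec on the host side
    print(fc_total_gbit, round(sas_ports, 1), round(x4_connectors, 1), host_side_gbit)
    ```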

    of course, we need to find out what the 5000P chipset could actually sustain

    if built on the AMD Opteron + nVidia PCI-E components
    I could have a 2-socket system with 10 x4 PCI-E ports
    (also pending how much IO this platform could sustain)
  6. bytehd New Member

    Google architects ignored all the existing/typical commercial offerings and designed something unique for their purposes.


    but shouldn't the Google architecture be the model for ALL ad-hoc query systems going forward?

    if not, then it's not that fast.
    if so, then it is that fast
  7. joechang New Member

    if you are building a 50k+ disk array, and have google resources
    you could build your own storage system

    still, google is not doing any of its own silicon design,
    so they are buying stock parts, ie, disks, disk controllers, computer systems, etc.
    presumably, they are not foolish enough to buy overpriced obsolete hw

    consider that mid-range SANs are just standard computers
    if you attach disks and controllers to a computer, you could call it a SAN or direct attach,
    but if the system is not serving disk partitions, it's not a SAN,
    and if the main app does not run locally, it's not direct attach?

    since google has its own search engine, not a commercial database,
    it does not need a SAN,

    it would be nice to have hard information
    but if i had to guess
    it would be a series of computers with storage attached (ie, direct attach), with search capabilities
    but the main app resides elsewhere
    so you could call it a storage processing unit

    since the rest of us are using commercial database engines,
    we cannot use an "intelligent" storage unit; we need standard storage units
