SQL Server Performance

RAID & filegroup placement

Discussion in 'Performance Tuning for Hardware Configurations' started by mikewhite, Jun 30, 2006.

  1. mikewhite New Member

    Hello all,

    We are currently building a Data warehouse on SQL Server 2005 x64 on Windows Server 2003 (x64). The database is loaded nightly via an import routine from an Oracle Rdb Database on a VMS Server. The VMS server uses RAID 0+1 (striped & mirrored). The structure of both databases is identical.

    The actual Rdb database contains 180 tables which are split over 250 files (equivalent to filegroups in MS SQL parlance) - we do this to place indexed and non-indexed information on different portions of the disk to optimise spindle movement etc.

    In terms of the SQL/Windows side of things, how do we best configure the disks to ensure optimum read access? We obviously want redundancy, but the system needs to be optimized for read access.

    We have been speaking to various technical people and the general feeling is that we should pretty much employ the same config as we have on the VMS server, i.e. RAID 0+1 with a hardware RAID controller. What are people's thoughts on splitting table data and index data into separate filegroups, and is this possible or necessary in SQL 2005?

    For example, a table called Invoices with two indexes (idx1 & idx2) would have the following physical files:

    d:\data\invoices_dat.mdf
    e:\data\invoices_idx1.mdf
    f:\data\invoices_idx2.mdf
    g:\data\invoices_log.ldf


    Please help, as we are unsure whether the efficiency gains of the file configuration in Rdb will be lost in the Windows environment.

    Many thanks
    Mike
  2. FrankKalis Moderator

  3. mikewhite New Member

    The database contains 65,000,000 rows on the VMS server, 180 tables
    with 258 indexes. The actual file count I gave in the original post
    is incorrect; it's more like 500.

    We have "budgeted" for 450 GB worth of disk space on the
    Windows server, but expect the database could physically live in
    50 GB of space.

    Probably getting slightly ahead of ourselves, but we would like to
    introduce date-range horizontal partitioning on some of the larger
    (5+ million row) tables. Basically we are looking to do everything
    we possibly can to make this db fly.

    Also worth mentioning is that the 64-bit server we have is currently on
    rental whilst we evaluate. We expect the server that we buy would have at least 14 GB
    RAM and four Opteron CPUs. We would assign three CPUs to SQL and one to Windows.

    Thanks
    Mike





  4. FrankKalis Moderator

    Typically, when databases are below 1 TB you can live very well with just one filegroup, but I'm sure your subject will get Joe Chang's attention and then you'll get more detailed information.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    Blogged today? http://www.insidesql.de/blogs
  5. mikewhite New Member


    We have SQL 2000 databases here at the moment and they all use just one filegroup.
    As you can tell, we are not MS SQL experts, so we just don't understand how to
    tune for performance if you can't place data on a particular spindle.

    Might be we are coming at this in a slightly old fashioned way.

    Cheers
    Mike
  6. Luis Martin Moderator

    Wait for Joe.
    Meanwhile, you can read about similar problems in this forum (hardware).



    Luis Martin
    Moderator
    SQL-Server-Performance.com

    All in Love is Fair
    Stevie Wonder


    All postings are provided “AS IS” with no warranties for accuracy.



  7. joechang New Member

    i don't think you should wait for me,
    i am currently in Costa Rica with iffy internet access

    personally, i see people discussing the strategy of multiple filegroups, but not the underlying disk configuration, which i regard as more important

    i will assume you are on s2k, not s2k5,
    the max disk transfer rate you will get on a s2k table scan is 700+MB/sec, much more on s2k5, i have seen as high as 15GB/sec

    so for s2k, first priority is to get the data portion spread across disks to get 800MB/sec
    for this, i suggest 12 36GB 15K SAS drives on 1 PCI-E adapter, or 2 SAS PCI-X adapters
    above applies to direct attach storage, if you really want a SAN, figure 10MB/sec per disk or 80 disks

    if you do not do the above, everything else is a waste of time
    if you do, i like the strategy used by HP's TPC-H (a data warehouse test) benchmark team

    on the above disk array, create one or more partitions for the main data filegroup using the first partition on each disk array
    then create a second partition for the index filegroups
    a third partition for the tempdb data files
    a fourth partition for a load filegroup, where the data is first loaded; then create a clustered index to move it to the first filegroup, which is on the fastest portion of the disk

    logs can reside on 1 pair of disks, but i have considered that for DW, the logs can be spread across part of the data disk as well

    read the HP Itanium TPC-H full disclosure report for the details
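    [Editor's note: a minimal sketch of a filegroup layout along these lines, using SQL Server 2005 CREATE DATABASE syntax. All database, filegroup, and file names and paths here are placeholder assumptions, not the actual HP benchmark configuration:]

    ```sql
    -- Separate filegroups for data, indexes, and a staging/load area,
    -- each placed on its own disk partition as described above.
    -- Names and paths are hypothetical placeholders.
    CREATE DATABASE DW
    ON PRIMARY
        (NAME = DW_sys,   FILENAME = 'D:\sys\DW_sys.mdf'),
    FILEGROUP DataFG
        (NAME = DW_data1, FILENAME = 'E:\data1\DW_data1.ndf'),
        (NAME = DW_data2, FILENAME = 'F:\data2\DW_data2.ndf'),
    FILEGROUP IndexFG
        (NAME = DW_idx1,  FILENAME = 'E:\idx1\DW_idx1.ndf'),
    FILEGROUP LoadFG
        (NAME = DW_load1, FILENAME = 'G:\load1\DW_load1.ndf')
    LOG ON
        (NAME = DW_log,   FILENAME = 'H:\log\DW_log.ldf');
    ```

    Secondary data files conventionally use the .ndf extension; only the primary filegroup holds the .mdf.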
  8. merrillaldrich New Member

  9. joechang New Member

    the general pattern with MS papers is that they reflect an excellent source of information on how SQL & Windows interact with hardware, considering the writers have access to the source code

    and that they are reasonably informed on technology; the writers have access to high-end hardware, but they do not live and breathe hardware performance, and cannot provide precise performance information
  10. mikewhite New Member

    Many thanks for your help chaps.

    The first thing to say is that we are running SQL 2005 64-bit. Does this mean that this configuration remains the same?

    Also, we are using MSDN Enterprise Edition, but are looking at buying Standard Edition. I am not sure whether partitioning is an option for this - have you any ideas?

    Many thanks

    Mike
  11. merrillaldrich New Member

    Partitioning is only available in the Enterprise edition, not standard.
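    [Editor's note: for reference, date-range partitioning in SQL Server 2005 Enterprise Edition looks roughly like the sketch below. The function, scheme, table, and boundary dates are hypothetical, chosen to match the Invoices example in this thread:]

    ```sql
    -- Hypothetical yearly date-range partitioning (Enterprise only in 2005).
    -- RANGE RIGHT: each boundary value starts a new partition.
    CREATE PARTITION FUNCTION pfInvoiceDate (datetime)
    AS RANGE RIGHT FOR VALUES ('2004-01-01', '2005-01-01', '2006-01-01');

    -- Map every partition to the PRIMARY filegroup for simplicity;
    -- in practice each partition could go to its own filegroup.
    CREATE PARTITION SCHEME psInvoiceDate
    AS PARTITION pfInvoiceDate ALL TO ([PRIMARY]);

    -- The table is then created on the scheme, keyed by the date column.
    CREATE TABLE dbo.Invoices (
        InvoiceID   int      NOT NULL,
        InvoiceDate datetime NOT NULL
    ) ON psInvoiceDate (InvoiceDate);
    ```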
  12. mikewhite New Member

    Hi there

    I have to say that I am unsure what to do now. A database this size reportedly runs OK using one filegroup. I would just like to restate the situation and reiterate the question.

    Note: we ARE using SQL 2005 x64.

    We are currently building a Data warehouse on SQL Server 2005 x64 on Windows Server 2003 (x64). The database is loaded nightly via an import routine from an Oracle Rdb Database on a VMS Server. The VMS server uses RAID 0+1 (striped & mirrored). The structure of both databases is identical.

    The actual Rdb database contains 180 tables which are split over 500 files (equivalent to filegroups in MS SQL parlance) - we do this to place indexed and non-indexed information on different portions of the disk to optimise spindle movement etc. The database contains 65,000,000 rows.

    In terms of the SQL/Windows side of things, how do we best configure the disks to ensure optimum read access? We obviously want redundancy, but the system needs to be optimized for read access.

    We have been speaking to various technical people and the general feeling is that we should pretty much employ the same config as we have on the VMS server, i.e. RAID 0+1 with a hardware RAID controller. What are people's thoughts on splitting table data and index data into separate filegroups, and is this possible or necessary in SQL 2005?

    For example, a table called Invoices with two indexes (idx1 & idx2) would have the following physical files:

    d:\data\invoices_dat.mdf
    e:\data\invoices_idx1.mdf
    f:\data\invoices_idx2.mdf
    g:\data\invoices_log.ldf

    i. Can this be done on SQL Server 2005, i.e. splitting tables/indexes?
    ii. Will we lose the benefits gained in Rdb by putting it into a Windows environment with one filegroup?
    iii. Are we worrying too much about this in light of what Joe says - although he assumed we were using 32-bit SQL 2000.

    I hope this makes things clearer.

    Thanks

    Mike



  13. joechang New Member

    one of the reasons to make multiple filegroups is to place the very largest table on its own filegroup, its indexes on another
    possibly the same for another very large table
    then all the smaller tables on one filegroup

    all of this is so you do not have to defrag the very big tables, typically 1TB or more

    another reason is if you had a DW, where you load up 1 month or quarter at a time, then you only need to backup the most recent load, instead of the whole db.

    the oracle practice probably stemmed from the days before there were hardware raid controllers, and the unix 2gb filesystem limit
    anybody who still believes this is relevant today probably has not learned anything in the last 10 years
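    [Editor's note: a minimal sketch of the table/index filegroup split discussed in this thread, under the assumption that filegroups named DataFG and IndexFG already exist in the database (both names are hypothetical):]

    ```sql
    -- Place a large table on one filegroup via its ON clause...
    CREATE TABLE dbo.Invoices (
        InvoiceID   int      NOT NULL,
        CustomerID  int      NOT NULL,
        InvoiceDate datetime NOT NULL
    ) ON DataFG;

    -- ...and its nonclustered indexes on another. SQL Server places
    -- the files; you do not manage one physical file per index.
    CREATE NONCLUSTERED INDEX idx1 ON dbo.Invoices (CustomerID)  ON IndexFG;
    CREATE NONCLUSTERED INDEX idx2 ON dbo.Invoices (InvoiceDate) ON IndexFG;
    ```

    So the split the original poster describes is possible in SQL 2005, but at the filegroup level rather than the per-index physical file level used by Rdb.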
  14. sc_simsl New Member

    A suggestion on the RAID - rather than RAID 0+1, use RAID 1+0 - it gives better redundancy and recoverability.
