SQL Server Performance Forum – Threads Archive

SQL striping vs raid striping

I’m in the process of setting up a new SQL server (SS05) for a data mining application (note: primarily heavy sequential access — writes during bulk loads and reads during mining/reporting), and I’m wondering if I should bother striping the raid array, or if I should just do mirroring on the controller and leave the striping to SS05. I have 16 spindles to use (250s) for data, and my database size need is approx 1TB, and I have no problems with max number of volumes, etc. (Controller is a pair of Dell PERC 5s, all 4 ports connected to two split disk racks; an additional separate PERC for the system/log disks.) CPU is dual quad Xeon, so I know I get a lot of benefit from running somewhere around +/-8 SQL stripes for the data files (multiple data files per filegroup); my question is if I would gain (even) more by having the raid controller also stripe, or if I should just leave the controller to do mirroring. I suppose there are four alternatives:
– one volume raid 10 across all 16 spindles, with ~8 files in the volume, all pre-sized to avoid frags
– 8 volumes, each striped raid 10 across all 16 spindles, each with one data file
– 8 volumes each raid 1 on two spindles, each with one data file
– 8 volumes, each striped raid 10 across all 16 spindles, each with ~8 files in the volume [Although I say raid 1/10, I plan on testing raid 5/50 as well; this question — to stripe at the disk controller level in addition to the file group level, or not — seems independent to me of the raid level.] Any thoughts??? (I don’t really see a benefit to the last alternative above, and I am ambivallent about the third alternative, but am having trouble deciding between the first two.) I wish I had time to thoroughly test this, but I don’t so I would really appreciate any advice! (I originally posted this over at StorageReview.net, but was directed to this forum…) Thanks!
-frank

I believe there would be no performance difference between the first two – either way you have 8 files on one RAID 10 array – the fact that the array is divided into different volumes is inconsequential from a performance perspective. Recommendations on option 1 vs option 3 would be interesting to me, but I suspect that with completely random access patterns, you’re better off just letting the controllers sort it out (i.e. option 1) Don’t bother with RAID 5 if you agree with the members of www.baarf.com

quote:Originally posted by mdefehr I believe there would be no performance difference between the first two – either way you have 8 files on one RAID 10 array – the fact that the array is divided into different volumes is inconsequential from a performance perspective. Recommendations on option 1 vs option 3 would be interesting to me, but I suspect that with completely random access patterns, you’re better off just letting the controllers sort it out (i.e. option 1) Don’t bother with RAID 5 if you agree with the members of www.baarf.com

I’m actually leaning towards option 1 now (1 raid 10 volume, ~8 files) myself, thinking that as an extent is only 4k, I really don’t want to stripe by extents alone as they are way too small. I’m also thinking — and this is radical — that maybe for this application, fragmentation is my friend: if I’m building these tables with sequential streaming, and then reading them with sequential streaming, if I pre-allocate the 8 files and avoid frags, then the sequential writes and reads will result in lots of head movement as the drives fetch extents from each file. Perhaps reordering would avoid a lot of this, but still it seems that letting the files grow and frag during the round-robbin process would reult in the data being laid down in an optimal ordering for a streaming read??? I need to think on this!
Thanks,
-frank

Check: http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=16995 Luis Martin
Moderator
SQL-Server-Performance.com All in Love is Fair
Stevie Wonder
All postings are provided â€œAS ISâ€ with no warranties for accuracy.

I would do one of the following 1. One RAID10 across all drives
2. One 2 drive RAID1 for TLOG of your main DB, one RAID10 across 14drives. Bert

]]>