SQL Server Performance Forum – Threads Archive

Need ideas for performance

Our daily duties require us to "process" just under a million new records per day. We receive them, import them into SQL Server 2000, populate columns of data based on substrings of what we originally received, previously populated columns, etc. At the end of the day, we’ve got say 1M processed records. That evening, we move those records to a monthly table on another server, so tomorrows 1M records have a clean place to get processed. We also copy those records to a data warehouse (used for reporting via a website to customers). This happens every workday throughout the month. At the end of the month, we’ve got about 30M records in our monthly table. The table size is typically around 60G (including indexes needed for invoicing the data, etc.). We also have summary records based on that information we use for reporting, website queries, etc. Some of these may be duplicated in the data warehouse database (on it’s own server). Unfortunately, sometimes records have to be reprocessed. This means unsummarizing the details for that record from numerous summary tables, moving the record back to the daily table for reprocessing, etc. Over time, some of the summary tables have gotten out of sync causing IT to produce conflicting reports to financial/management departments. In a more perfect world, we’d like to have one set of data. The data warehouse hit by customers, the tables used for invoicing, the tables used for querying for ad-hoc reports, etc. would all hit one set of data. Unfortunately, even quad processor servers with 16 15K RPM drives running RAID 5 don’t offer the performance required to do this. One ad-hoc query could bring the server to it’s knees if table scans ensue. Is a SAN in our future? Can the SANs from EMC, IBM, HP/Compaq do what we need with their "flash copies", etc.? I’m really looking for suggestions here, so if you’ve got ’em, don’t hold back!

Your best bet is to know your app design and look for join keys/columns and also to run the Index Tuning Wizard. There are a couple of great papers on MSDN about Indexing strategies as well:http://msdn.microsoft.com. And also ensure to maintain the AUTOSTATS on the frequently updated tables for optimum performance. It’s less a matter of size than manageability. You may use separate filegroups to stage filegroup backup, or to avoid mixing extents between tables. But the key design issue is to allocate multiple files per filegroup to allow you to span multiple volumes on a SAN. Multiple filegroups per se is less important. So your best bet is to get a REALLY fast drive/set of striped drives that are optimized for write activity. This should probably be on its own separate controller and drive array for optimal performance. I bet SAN will definetly has the advantage of gaining performance and addressing the performance issues on the sheer size of database. Your best bet is to read the hardware literature of the vendors (EMC, HP, Hitachi, IBM, etc.) and also there are some papers onhttp://www.microsoft.com/technet andhttp://www.microsoft.com/sql/techinfo/administration/2000/rosetta.asp links for your reference. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided â€œAS ISâ€ with no rights for the sake of knowledge sharing.

]]>