Tips for Optimizing SQL Server OLAP/Analysis Services: Data Warehouse & Datamart Performance

While in production, set the “read only” database option to true for your data warehouses and datamarts (not cubes). This will turn off locking and greatly speed queries against the data. If you do make your database “read-only, be sure to update the database’s statistics first. This is especially important for SQL 7.0 and 2000 as “Auto Update Statistics” is turned off when a database is set to “read only”. [6.5, 7.0, 2000] Updated 3-15-2005

*****

With your data warehouses and datamarts, which are essentially read-only, having as many indexes as necessary for covering virtually any query is not normally a problem. The only major disadvantage of having lots of indexes in your data warehouse or datamart is that it increases the load time of data into your databases. [6.5, 7.0, 2000, 2005] Updated 3-15-2005

*****

Consider adding a clustered index to any table in a data warehouse that needs to produce sorted results. This way, the data is already pre-sorted (by the clustered index key), saving a lot of time when the query is actually run. This becomes more important as huge numbers of rows are returned from your query. [7.0, 2000, 2005] Updated 3-15-2005

*****

When using the star schema design, at a minimum, you will create a non-clustered index on the primary key of each dimension table and a non-clustered index on each of the related foreign-keys. From there, you can create non-clustered indexes on additional columns that will be queried on frequently. You don’t need to create composite indexes to create covering indexes because SQL Server will use index intersection to do this for you automatically. [7.0, 2000, 2005] Updated 3-15-2005

*****

The SQL Server Profiler and the Index Tuning Wizard (or Database Engine Tuning Advisor for 2005) can be used for tuning indexes for data warehousing and datamarts as well as for OLTP databases. [7.0, 2000, 2005] Updated 3-15-2005

*****

When you create indexes on your data warehouses and datamarts, use a FILLFACTOR of 100 to ensure that the index pages are as full as possible. This reduces unnecessary I/O, speeding up performance. [6.5, 7.0, 2000, 2005] Updated 3-15-2005

*****

Schedule large data imports or exports on your production servers during less busy periods of the day to reduce the impact on your users. [6.5, 7.0, 2000, 2005] Updated 3-15-2005

*****

If you need to periodically perform data loads into your datamart or data warehouse, consider dropping the indexes on the applicable tables before starting the data load. In most cases, it is faster to drop indexes, insert the data, and re-add the indexes, than it is to import bulk data into tables that have pre-existing indexes. You will want to test this suggestions before implementing it to see how much it can, or cannot help. [7.0, 2000, 2005] Updated 1-28-2005

*****

Always select the smallest data type you can when building data warehouses and OLAP cubes. This reduces the amount of storage required, reducing server I/O and boosting performance.

When it comes to choosing between CHAR and VARCHAR data types, use CHAR when the data length is predictable, and use VARCHAR when data length will have a standard deviation of over 2 characters. This is because VARCHAR data types require 2 bytes of overhead, and if the standard deviation of the average length of expected data is less than 2, then the CHAR datatype can be more efficient of storing data.

Also avoid the text and NCHAR and NVARCHAR datatypes. These generally waste valuable space.

While it may seem that selecting the smallest possible datatype for storing data is no big deal, it can be in data warehouses where millions of records are stored, and having one unnecessary byte of data, a million times, can be a big deal. [6.5, 7.0, 2000, 2005] Updated 1-28-2005

Continues…

Leave a comment Cancel reply