Storage Modes in SSAS (MOLAP, ROLAP and HOLAP)

Introduction

There are three standard storage modes (MOLAP, ROLAP and HOLAP) in OLAP applications which affect the performance of OLAP queries and cube processing, storage requirements and also determine storage locations.

SSAS (2005 and 2008) supports not only these three standard storage modes but supports proactive caching, a new feature with SSAS 2005 which enables you to combine the best of both worlds (ROLAP and MOLAP storage) for both frequency of data refresh and OLAP query performance.

In the part 1 of this article, I will start my discussion with an overview of each mode and then in part 2 I will cover the new proactive-caching feature of SSAS.

Basic storage modes

The cube data can be divided into three different types – meta-data, detail data and aggregate data. No matter what storage is used, the meta-data will always be stored on the OLAP server but storage of the detail data and aggregate data will depend on the storage mode you specify.

MOLAP (Multidimensional OLAP)

This is the default and most frequently used storage mode. In this mode when you process the cube, the source data is pulled from the relational store, the required aggregation is then performed and finally the data is stored in the Analysis Services server in a compressed and optimized multidimensional format.

After processing, once the data from the underlying relational database is retrieved there exists no connection to the relational data stores. So if there is any subsequent changes in the relational data after processing that will not reflect in the cube unless the cube is reprocessed and hence it is called offline data-set mode.

Since both the detail and aggregate data are stored locally on the OLAP server, the MOLAP storage mode is very efficient and provides the fastest query performance.

· Pros

o Stores the detail and aggregate data in the OLAP server in a compressed multidimensional format; as a result the cube browsing is fastest in this mode.

o Provides maximum query performance, because all the required data (a copy of the detail data and calculated aggregate data) are stored in the OLAP server itself and there is no need to refer to the underlying relational database.

o All the calculations are pre-generated when the cube is processed and stored locally on the OLAP server hence even the complex calculations, as a part the query result, will be pefromed quickly.

o MOLAP does not need to have a permanent connection to the underlying relational database (only at the time of processing) as it stores the detail and aggregate data in the OLAP server so the data can be viewed even when there is connection to the relational database.

o MOLAP uses compression to store the data on the OLAP server and so has less storage requirements than relational databases for same amount of data. (Note however, that beginning with SQL Server 2008 you can use data compression at relational database level as well).

· Cons

o With MOLAP mode, you need frequent processing to pull refreshed data after last processing resulting in drain on system resources.

o Latency; just after the processing if there is any changes in the relational database it will not be reflected on the OLAP server unless re-processing is performed.

o MOLAP stores a copy of the relational data at OLAP server and so requires additional investment for storage.

o If the data volume is high, the cube processing can take longer, though you can use incremental processing to overcome this.

ROLAP (Relational OLAP)

In comparison with MOLAP, ROLAP does not pull data from the underlying relational database source to the OLAP server but rather both cube detail data and aggregation stay at relational database source. In order to store the calculated aggregation the database server creates additional database objects (indexed views). In other words, the ROLAP mode does not copy the detail data to the OLAP server, and when a query result cannot be obtained from the query cache the created indexed views are accessed to provide the results.

· Pros

o Ability to view the data in near real-time.

o Since ROLAP does not make another copy of data as in case of MOLAP, it has less storage requirements. This is very advantageous for large datasets which are queried infrequently such as historical data.

o In ROLAP mode, the detail data is stored on the underlying relational database, so there is no limitation on data size that ROLAP can support or limited by the data size of relational database. In nutshell, it can even handle huge volumes of data.

· Cons

o Compared to MOLAP or HOLAP the query response is generally slower because everything is stored on relational database and not locally on the OLAP server.

o A permanent connection to the underlying database must be maintained to view the cube data.

Note:

If you use ROLAP storage mode and your relational database is SQL Server, the Analysis Services server may create indexed views for aggregation. However this requires a few prerequisite to be available – for example, the data source must be a table, not a view. The table name must use two part naming convention or it must be qualified with owner/schema name etc. For a complete list of these prerequisites you can refer to the link provided in reference section.

Continues…

MOLAP (Multidimensional OLAP)

ROLAP (Relational OLAP)

Leave a comment Cancel reply