Slowly Changing Dimensions in SQL Server 2005

Type 1 SCD
A Type 1 SCD simply overwrites the original information. In other words, no history is kept. For example, if a customer’s region changes, the previous region is overwritten by the new one and the previous value is lost.

This is the easiest way to handle the SCD problem, since there is no need to keep track of the old information. An obvious disadvantage of this type of SCD is that you are unable to analyse historical data. For example, all the previous sales for a customer will be analysed under the new region and not under the old region. Because a Type 1 SCD is so simple to implement, it is reported that more than 60% of dimensions fall into the Type 1 category.
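
The overwrite behaviour can be illustrated with a minimal T-SQL sketch. The table variable @dimCustomer, its columns, and the sample values are hypothetical and exist only for illustration; they are not the dimension table used later in this article.

```sql
-- Hypothetical, simplified customer dimension held in a table variable.
DECLARE @dimCustomer TABLE (
    CustomerID int PRIMARY KEY,
    FirstName  varchar(50),
    RegionName varchar(50)
);

INSERT INTO @dimCustomer (CustomerID, FirstName, RegionName)
VALUES (1, 'Ann', 'North');

-- Type 1 change: the customer moves to another region,
-- so the old value is simply overwritten.
UPDATE @dimCustomer
SET    RegionName = 'South'
WHERE  CustomerID = 1;

-- Only the new region remains; 'North' can no longer be recovered from this table.
SELECT CustomerID, FirstName, RegionName
FROM   @dimCustomer;
```

After the update, any report that groups this customer by RegionName sees only the new region, which is the loss of history described above.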

Implementation of Type 1 SCD
Now let us see how we can implement a Type 1 SCD with SQL Server 2005.

First, create an SSIS project by launching Business Intelligence Development Studio (BIDS). Then add a new SSIS package to the project. The next step is to create a new connection manager pointing to the relevant database. Then add a Data Flow task to the control flow and double-click it to open the data flow.

Add an OLE DB Source, open its editor, select SQL command as the data access mode, and enter the following T-SQL command.

SELECT  cus.CustomerID,
        cus.FirstName,
        cus.LastName,
        cus.MaterialStatus,
        reg.RegionName,
        cat.CategoryName
FROM    staging.tblCategory AS cat
        INNER JOIN staging.tblcustomer AS cus ON cat.ID = cus.CategoryID
        INNER JOIN staging.tblRegion AS reg ON cus.RegionID = reg.ID

The connection manager should look like this:

Next, drag and drop the Slowly Changing Dimension transformation onto the data flow and connect it to the output of the OLE DB source. After dropping it, you will be taken to the SCD wizard. If not, double-click the transformation to open the wizard.

On the first screen, you select the mapping between the source (staging) and the dimension (data warehouse) table.

As there is only one connection available, that connection is selected as the connection manager by default.

Next you need to select the dimension table. In this case it is datawarehouse.dimcustomer.

After that, our next task is to map the input columns to the dimension columns. By default, columns with similar names and data types will be mapped. Out of the existing columns, you need to select at least one as the Business key. The Business key is the key used to look up the dimension table and determine whether an incoming row needs an insert or an update.
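
The business key lookup can be pictured as a join between the staging rows and the dimension table. The query below is only a sketch of that idea, not what the SCD wizard generates internally. It assumes that CustomerID is chosen as the Business key and that datawarehouse.dimcustomer has a CustomerID column holding it.

```sql
-- Assumption: CustomerID is the business key and exists in both tables.
-- A staging row with no match in the dimension needs an insert;
-- a row with a match is a candidate for an update.
SELECT  src.CustomerID,
        CASE WHEN dim.CustomerID IS NULL
             THEN 'INSERT'
             ELSE 'EXISTING'
        END AS RequiredAction
FROM    staging.tblcustomer AS src
        LEFT OUTER JOIN datawarehouse.dimcustomer AS dim
            ON dim.CustomerID = src.CustomerID;
```

Rows marked 'EXISTING' need an update only if one of their attributes actually differs from the value stored in the dimension.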

You will not be able to map two columns if they have different data types. If the data types differ, you will have to convert the source column to the data type of the dimension table column by using a Data Conversion transformation in the data flow.
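
As an alternative to the Data Conversion transformation, the conversion can also be done in the source query itself. The sketch below assumes a purely hypothetical mismatch in which the dimension stores MaterialStatus as nvarchar(20) while the staging column is varchar; the actual data types are not given in this article.

```sql
-- Hypothetical mismatch: staging holds MaterialStatus as varchar,
-- while the dimension column is assumed to be nvarchar(20).
SELECT  cus.CustomerID,
        cus.FirstName,
        cus.LastName,
        CAST(cus.MaterialStatus AS nvarchar(20)) AS MaterialStatus,
        reg.RegionName,
        cat.CategoryName
FROM    staging.tblCategory AS cat
        INNER JOIN staging.tblcustomer AS cus ON cat.ID = cus.CategoryID
        INNER JOIN staging.tblRegion AS reg ON cus.RegionID = reg.ID;
```

Converting in the query keeps the data flow shorter, while the Data Conversion transformation keeps the conversion visible in the package design. Either approach should allow the columns to be mapped in the wizard.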

The next step is to assign a change type to each column.

In the case of a Type 1 SCD, two change types are relevant: fixed attribute and changing attribute. Fixed attributes are attributes that are not expected to change. For example, the first name and last name will not change unless there was a typing error. Changing attributes are attributes whose existing values are overwritten by the new values.

The following screen appears next.
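
What the changing attribute type means for the dimension table can be sketched as a plain T-SQL update. This is an illustration, not the statement the wizard produces. It assumes that RegionName and CategoryName are marked as changing attributes, that CustomerID is the Business key, and that datawarehouse.dimcustomer has columns with the same names as the source query.

```sql
-- Assumptions: RegionName and CategoryName are changing attributes,
-- CustomerID is the business key, and the dimension column names
-- match those of the source query.
UPDATE  dim
SET     RegionName   = src.RegionName,
        CategoryName = src.CategoryName
FROM    datawarehouse.dimcustomer AS dim
        INNER JOIN (
            SELECT  cus.CustomerID,
                    reg.RegionName,
                    cat.CategoryName
            FROM    staging.tblCategory AS cat
                    INNER JOIN staging.tblcustomer AS cus ON cat.ID = cus.CategoryID
                    INNER JOIN staging.tblRegion AS reg ON cus.RegionID = reg.ID
        ) AS src
            ON src.CustomerID = dim.CustomerID
-- Touch only the rows where a changing attribute really differs.
-- Note: <> does not detect changes to or from NULL.
WHERE   dim.RegionName   <> src.RegionName
   OR   dim.CategoryName <> src.CategoryName;
```

Columns treated as fixed attributes, such as FirstName and LastName in the example above, are deliberately left out of the SET list.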

Continues…
