Building Cubes with SQL Server 2005 Analysis Services
I admit that when it comes to SQL Server Analysis Services (SSAS), I found plenty of reasons to not get started. I have enough to do on a daily basis and I was just not very interested in learning about how to build cubes. Also, there is not much of a demand for cubes where I currently work, so there was no real incentive for me to spend a lot of time learning something new.
Well, things changed for me a few weeks back, as I was given the opportunity to attend a training class on SSAS. Previously, my only experience with SSAS was the time I tried to create a new Analysis Services project in the Business Intelligence Development Studio (BIDS). I struggled my way through and could not figure out how all the pieces fit, despite having a wealth of information from MSDN at my fingertips. I looked to this class to give me enough of a foundation to be able to at least build and understand cubes, if even to a limited extent.
By the time the class was done I found myself more comfortable working with the project designer in BIDS, how to get started, how to configure, and how to browse cubes. Not only was I more comfortable, but I now find myself going out of my way to experiment with dimensions, facts, and everything else related to cubes and SSAS in general. This article will detail the steps involved and assumes that you have the AdventureWorksDW database available as a data source, you have access to BIDS, and that you have an instance of SSAS running for you to deploy your cube. For details on how to install the AdventureWorksDW database, please see: http://msdn2.microsoft.com/en-us/library/ms143804.aspxDefinitions
Before we dive into building a cube, we need to review some basic definitions. This will help to ensure that at the end you not only have a working cube but an idea of what the cube represents, as well as how to use it. The first two items we will define are Facts and Dimensions.
In the AdventureWorksDW database you will find tables prefixed with the letters ‘Dim’ and ‘Fact’. Ideally, any time you build something to be called a data warehouse you would be building your tables with these two concepts in mind. But what are they? Well, the simplest explanation is to think of them as nouns and verbs, and try not to reflect back to those frustrating hours spent in grammar class. Instead, try to forge ahead with a “Schoolhouse Rock” version of grammar by looking at an example of each.
A dimension is commonly thought of as your noun or subject. It represents an object or a thing that either does something or has something done to it. It may help to also think of a dimension as something that can exist independently of events. An employee, a customer, and a product are all examples of dimensions. When you see a dimension built in the project below, you will find that it has things called attributes, which are simply another way to say they are columns from the underlying data source.
A fact, however, is your verb. It is the event that takes place against a dimension. An example of a fact would be the sale of a product. The sale is the fact, and the product is the dimension. Now, going forward you should be able to understand the fundamental difference between facts and dimensions. When outlining what is needed to build a cube, you should be able to answer basic questions regarding dimensions and facts. And for those of you that did well in grammar class, you could probably diagram a few sentences and build yourself a well structured data warehouse.
Later, we will be able to create a visual representation of facts and dimensions, and how they come together to form what is called a schema. For now, let us press on with two more important definitions: measures and hierarchies.
Your fact table should contain numerical data only, and not contain the descriptive information that would be found in a dimension table. This numerical data is then defined as a measure. Think of a fact table as an entity that contains rows based upon events, such that you would perform aggregate calculations on those events. For example, a fact table could contain one row for each sale made. Later on you will browse through some measures, and this will help you to understand how they were built and what they are to be used for.
Hierarchies are defined inside of dimensions, and are created based upon existing primary-foreign key relationships. Hierarchies allow for one to “drill-through” a cube. Think of an example involving customers, address, countries, etc. You could start browsing a cube based upon countries where you customers reside, looking at totals sales by country. Then, you could drill down to view specific states within the United States, then specific addresses, then even a specific customer name. Each of those levels is defined as an hierarchy.
For more in depth knowledge of data warehousing, please consult the following book http://www.amazon.com/Microsoft-Data-Warehouse-Toolkit-Microsoft-Business-Intelligence/dp/0471267155/ref=pd_bbs_1/102-0309797-5250571?ie=UTF8&s=books&qid=1185241554&sr=8-1.
First thing you will need is to create a new Analysis Services project in BIDS.
Once that is done, you will need to define a data source next. Inside the Solution Explorer you will see the project you have just created, and directly under that project you should see a folder named ‘Data Sources’ (Figure 2)
Right-click and select ‘New Data Source’. You will be prompted with the Data Source Wizard, select ‘Next’ and you should see the following (Figure 3):
Select ‘New’, and define a connection. Here, I will connect to an existing AdventureWorksDW database (screen print deliberately not shown), then click ‘Next’.
Select ‘Default’ for the Impersonation Information settings. If you are not able to connect with default permissions, you will need to get in touch with whoever is administering your SSAS installation and make certain you agree as to what selection you should be making in the above screen. Another click of ‘Next’, then name your Data Source, and then select ‘Finish’.
You should now see the following in your Solution Explorer (Figure 5):
Now, you may have noticed that we are using the sample AdventureWorksDW database, as opposed to the AdventureWorks database. Why? Well, because the AdventureWorksDW database has already been built with nicely defined dimension and fact tables, complete with pristine data. Such entities are crucial to the building of cubes, and the AdventureWorksDW database makes things easier for this walkthrough.
However, in real world scenarios it is unlikely you will be handed a pristine data warehouse to work with when building cubes. The options you have at that point are outside the scope of this article, but could include the building of views against existing source data. I recommend any book by Ralph Kimball on the subject of data warehousing to help you along in this area. But for now, we will just continue with our building of a cube against a nice data source.