Data Mining with Microsoft Office Excel
Introduction
A key issue in data mining is not only generating the results but also interpreting them. Unfortunately it is often difficult to interpret the results of data mining and many users are turning to Excel and its data mining add-in tool.
Requirements
You will need to have permission to access SQL Server Analysis Services. You will need to have Excel installed (preferably 2007 or later). The Microsoft SQL Server Data Mining Add-ins for Microsoft Office 2007 can be downloaded from:
This package includes two add-ins for Microsoft Office Excel 2007 (Table Analysis Tools and Data Mining Client) and one add-in for Microsoft Office Visio 2007 (Data Mining Templates). In this article, will focus on the Table Analysis Tool for Excel 2007.
Unfortunately, this add-in does not work in a 64 bit environment and Microsoft has even said we won’t get 64bit support version in the near future.
Installation
After the standard dialog box in the installation, you will get the following dialog requesting you to select your installation options.
By default, the Table Analysis Tools for Excel is selected and if you wish, you can select the other two options.
Configuration
In the start menu under Microsoft SQL 2008 Data Mining Add-ins, you will see a short cut named Server Configuration Utility which is the tool you to configure the add-in.
At the first dialog, you need to enter the analysis server name, the authentication method should be Windows Authentication as analysis services does not support SQL Server Authentication.
You need to allow the creation of the temporary mining model in SQL Server Analysis Service(SSAS). You can enable this from SQL Server Analysis Service by selecting the properties of the Analysis Service logging in to the SQL Server Analysis Services from SQL Server Management Studio.
Also, you can configure this from the installation wizard.
In the wizard, the next step is to create a SSAS database or else you can allow temporary mining models in the SSAS database.
With the installation, you will get a sample Excel file named DMAddings_SampleData.xlsx to test with. When you open the excel file, in the Analyze ribbon you will see that available for table analyze:
The below table lists the data mining algorithms used by each analysis option.
Menu Option |
Data Mining Algorithm |
Analyze Key Influencers |
Naïve Bayes |
Detect Categories |
Clustering |
Fill form Example |
Logistic Regression |
Forecast |
Time Series |
Highlight Exceptions |
Clustering |
Scenario Analysis (Goal Seek) |
Logistic Regression |
Scenario Analysis (What If) |
Logistic Regression |
Prediction Calculator |
Logistic Regression |
Shopping Basket Analysis |
Association Rule |
Analyze Key Influences
When you have a data set there can be several attributes. For example, in case of a customer dataset, there can be several attributes like Marital Status, Gender, Income and Occupation etc. Analyze Key Influences examines what are the influencing factors for a given attribute. In the Table Analysis Tools Sample sheet, you can find a table with customer attributes.
Let us assume that we wish to analyse the Occupation column for key factors.
1. First select Analyze Key Influences from the Analyze ribbon.
2. Next select the Occupation column for analyze: