SQL Server Data Mining with Microsoft Office Excel

3.       Next, choose the columns for analysis by clicking Choose columns to be used for analysis

By default, all  columns are selected. However, you can ignore columns such as ID. If there are derived columns you should ignore these as well. For example, there can be a bonus column which is based on a  percentage of the salary column. 

Press OK for this dialog and click Run for the other dialog and you will get the following results in a new sheet in the same Excel file.

The abThe above table indicates that Income range 39,050 – 71,062 is more favored to Skilled Manual Occupation.
The Relative Impact column, which is the last column, indicates the relative impact to the Occupation. This is a type of rank with strongest getting a score100 and the  weakest a score of 0.

In case you want to find to out the relationship between selected factors, you can use the table filtering option in Excel. By doing this you can ignore unnecessary values. For example, if you want to find out the influencing factors for Age and Income towards Occupation, you can filter the column field and it will look as below:

Discrimination on Key Influencers

When generating a key influencing report, you can add any number of discrimination reports which compare how key factors differentiate between them.

From the following dialog box you can view a report of the discrimination between occupations of Skilled manual and Clerical occupations.

The below shows the discrimination report between Skilled Manual and Clerical. The final column was added manually and not by the add-in tool. This column is the sum of column 3 and column 4.

The above report will list factors that will influence for the selected occupations in descending order. You can add any number of discrimination reports, but you have to generate them at the instance and so you cannot add them later.

The influencers presented in the discrimination report are often different from those presented in the main report. Some influencers from the main report are not present in the discrimination report. Those are important when comparing Skilled manual and Clerical against all other occupation types.

Detect Categories

When working with large sets of data, it  is much easier to deal with a smaller and more manageable sub set of the data which has similar behavior patterns. This practice is called Customer Segmentation.  

The Detect Categories table analysis tools finds natural groups in the data. It analyses your data, finds the most common combinations of column values and then defines groups based on these common patterns.  

As in the previous case, you can use the Table Analysis Tools Sample Excel sheet. After selecting the table in the sheet, select Detect Categories from the Analyze table which will bring up following dialog box:

As in the previous case, you need to select the attributes that need to be categorized.


Pages: 1 2 3


No comments yet... Be the first to leave a reply!