SQL Server Data Mining with Microsoft Office Excel
3. Next, choose
the columns for analysis by clicking Choose columns to be used for
By default, all columns are selected.
However, you can ignore columns such as ID. If there
are derived columns you should ignore these as well. For example, there can be a bonus
column which is based on a percentage of the salary column.
Press OK for this dialog and click Run for
the other dialog and you will get the following results in a new sheet in the
same Excel file.
The abThe above table indicates that Income range
39,050 – 71,062 is more favored to Skilled Manual Occupation.
The Relative Impact column, which is the last column,
indicates the relative impact to the Occupation. This is a type of rank
with strongest getting a score100 and the weakest a score of 0.
In case you want to find to out the relationship
between selected factors, you can use the table filtering option in Excel. By doing
this you can ignore unnecessary values. For example, if you want to find out the
influencing factors for Age and Income towards Occupation, you can filter the column
field and it will look as below:
Discrimination on Key Influencers
When generating a key influencing report, you can add any
number of discrimination reports which compare how key factors
differentiate between them.
From the following dialog box you can view a report of the discrimination between occupations of Skilled manual and Clerical occupations.
The below shows the discrimination report between Skilled Manual and Clerical.
The final column was added manually and not by the add-in tool. This column is
the sum of column 3 and column 4.
The above report will
list factors that will influence for the selected occupations in descending
order. You can add any number of discrimination reports, but you have to generate
them at the instance and so you cannot add them later.
presented in the discrimination report are often different from those presented
in the main report. Some influencers from the main report are not present in
the discrimination report. Those are important when comparing Skilled manual
and Clerical against all other occupation types.
When working with large sets of data, it is much easier to deal with
a smaller and more manageable sub set of the data which has similar behavior patterns. This
practice is called Customer Segmentation.
The Detect Categories table analysis tools finds natural
groups in the data. It analyses your data, finds the most common combinations
of column values and then defines groups based on these common patterns.
As in the previous case, you can use the Table Analysis Tools
Sample Excel sheet. After selecting the table in the sheet, select Detect
Categories from the Analyze table which will bring up following dialog box:
As in the previous case, you need to select the attributes that
need to be categorized.