Claytons Data Mining (Part 2)
Fill From Example
The ‘Fill From Example’ tool demonstrates a complete data mining cycle and allows us to predict unknown values. The process is to build a mining model, train it on sample data (thereby determining which rules contribute to the target value), and then apply those rules to new data. This extends the concepts of ‘Key Influencers’ and ‘Categories’ that were discussed above.
To demonstrate this, we use the data in the sheet ‘Fill From Example’. This is similar to the data from the prior examples but, unlike the other exercises, the ‘Purchased Bike’ column has been removed and an additional column, ‘High Value Customer’, has been created. Note also that the first 9 customers have been classified as being either high value or not. A typical scenario for this tool is one where an expert fills in the sample data and that classification is then projected onto the rest of the data (assume that in this scenario the 9 customers were classified by a marketing manager).
Our goal in this data mining operation is to determine whether the other customers are high value or not. To do this the engine determines the underlying pattern in the first 9 rows (that is, the ones that are known) and then applies the rules it learns to each row of the remaining data.
Figure 13 – Fill from Example Scenario
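The train-then-apply cycle described above can be sketched outside Excel. The Python below is a minimal illustration only: it uses a simple naive Bayes classifier as a stand-in and is not the algorithm the add-in runs. The function name fill_from_example, the two feature columns and all of the row values are invented for the example; only the ‘High Value Customer’ column name comes from the sheet.

```python
from collections import Counter, defaultdict

def fill_from_example(rows, target):
    """Fill in `target` where it is None, learning from rows where it is known.

    Hypothetical helper: a naive Bayes stand-in, not the add-in's own method.
    """
    known = [r for r in rows if r[target] is not None]
    features = [k for k in rows[0] if k != target]

    # Training: count how often each label occurs, and how often each
    # feature value occurs alongside each label.
    class_counts = Counter(r[target] for r in known)
    value_counts = defaultdict(Counter)  # (feature, label) -> value counts
    for r in known:
        for f in features:
            value_counts[(f, r[target])][r[f]] += 1

    # Number of distinct values per feature, used for add-one smoothing.
    distinct = {f: len({r[f] for r in rows}) for f in features}

    def score(row, label):
        p = class_counts[label] / len(known)
        for f in features:
            seen = value_counts[(f, label)][row[f]]
            p *= (seen + 1) / (class_counts[label] + distinct[f])
        return p

    # Prediction: apply the learned counts to every unclassified row.
    filled = []
    for r in rows:
        out = dict(r)
        if out[target] is None:
            out[target] = max(class_counts, key=lambda lbl: score(r, lbl))
        filled.append(out)
    return filled

# Invented sample: four rows classified by the "expert", two left blank.
customers = [
    {"Income": "High", "Home Owner": "Yes", "High Value Customer": "Yes"},
    {"Income": "High", "Home Owner": "No",  "High Value Customer": "Yes"},
    {"Income": "Low",  "Home Owner": "Yes", "High Value Customer": "No"},
    {"Income": "Low",  "Home Owner": "No",  "High Value Customer": "No"},
    {"Income": "High", "Home Owner": "Yes", "High Value Customer": None},
    {"Income": "Low",  "Home Owner": "No",  "High Value Customer": None},
]

filled = fill_from_example(customers, "High Value Customer")
for row in filled[4:]:
    print(row)
```

In this invented sample the expert's labels follow income exactly, so the two blank rows are filled with ‘Yes’ and ‘No’ respectively. With real data the pattern is rarely this clean, which is why the relative importance of each column is worth inspecting alongside the predictions.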
To perform the operation, simply click the ‘Fill From Example’ button and select the column for which you wish to determine the unknown values (that is, High Value Customer).
There are two outputs generated by this operation. Firstly, the column titles and their values that are deemed important for prediction (and their relative importance) are added as an additional sheet to the workbook. This is similar to the ‘Key Influencers’ output discussed above. Secondly, the original data (table) is amended with a prediction column which shows the predicted value based on the rules that have been determined. These outputs are displayed in Figure 14 and Figure 15 below.
Figure 14 – Fill From Example Relative Strength
Figure 15 – Fill From Example Value Prediction