Claytons Data Mining (Part 2)
The data mining methods examined so far generally require many input columns in order to determine which columns (and which of their values) are most important in determining the target. Other methods require only a small number of columns and look for recurring patterns in the data. These methods include link analysis (commonly called shopping basket analysis) and time series forecasting. Shopping basket analysis uses shopping transactions (which include the products in each purchase) to determine which products are purchased together. Time series forecasting uses values along a timeline to project what the future values will be. The concept may have its origins in charting, where a chart is extended past the known data to predict future values. Note that in this approach the values are separated by regular time intervals (for example, monthly intervals).
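To make the "purchased together" idea concrete, here is a minimal Python sketch that counts how often each pair of products appears in the same transaction. It is not what the Excel add-in does internally. The transactions, product names and the `pair_counts` helper are all hypothetical and serve only as illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each list holds the products in one purchase.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "butter"],
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so that (bread, milk) and (milk, bread)
        # are treated as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(transactions)

# List the pairs that occur together most often.
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs with high counts are candidates for products that tend to be bought together. A real analysis would also weigh how common each product is on its own.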
The Excel add-in includes a forecasting component that estimates values for future time periods. The input data must contain sequential time periods, as in the sheet ‘Forecasting’. In this example we project the next five periods. Click the Forecast button, specify the column to forecast (Sales Amounts), and set the time stamp (the time column). You can also specify the number of future periods to project (5 in our case).
See Figure 16 below.
Figure 16 – Forecast Parameters
The forecasting operation generates two outputs. First, a worksheet is added to the workbook which shows the projected values as a dashed line (see Figure 17). Second, the projected values are appended sequentially to the original table (Figure 18). Note that the chart is generated from the original table together with the forecasts that were appended as part of the data mining operation.
Figure 17 – Forecasting Chart
Figure 18 – Forecasting Values
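The idea of extending a chart past the known data can be sketched in a few lines of Python. The example below is a deliberately simple stand-in, not the algorithm the add-in uses. It fits a least-squares straight line to evenly spaced values and extends it five periods, then appends the projections to the original series much as the add-in appends them to the table. The sales figures and the `linear_forecast` helper are hypothetical.

```python
def linear_forecast(values, periods):
    """Fit a least-squares line to evenly spaced values and extend it.

    Assumes the values are separated by regular time intervals, so the
    period index 0, 1, 2, ... can stand in for the time stamp.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # Project the line onto the next `periods` indices after the known data.
    return [intercept + slope * (n + k) for k in range(periods)]

# Hypothetical monthly sales amounts.
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]

forecast = linear_forecast(sales, 5)

# Append the projections to the known values, as the add-in does in the table.
table = sales + forecast
print(table)
```

A straight line ignores seasonality and other repeating patterns, so treat this only as an illustration of projecting regularly spaced values forward.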