Using The Data Mining Query Task in SSIS

Using Analysis Services Processing Task & Analysis Services Execute DDL Task in SSIS I discussed how to use the Analysis Services Processing Task to process an “Adventure Works DW Standard Edition” sample database which is available for SQL Server 2008 and we also discussed, how to backup an Analysis Service Database using Analysis Services Execute DDL Task which is available in SQL Server 2005 Integration Services and later versions. In this article I will examine the use of the Data Mining Query Task which is available in SQL Server 2005 and later versions.

If you are new to SQL Server Integration Services then I would recommend you to start with my introductory article SQL Server 2008 Integration Services Tasks.  

Overview of The SQL Server Analysis Services Task

SQL Server Analysis Services Tasks within SSIS can be used to create, modify, delete and process Analysis Services objects. The different types of Analysis Services tasks which are available in SQL Server 2005 Integration Services and later versions are Analysis Services Processing Task, Analysis Services Execute DDL Task and Data Mining Query Tasks.

The Analysis Services Execute DDL Task is very  similar to the Execute SQL Task; however using the Analysis Services Execute DDL Task you can issue Data Definition Language statements against an Analysis Services system. The DDL statements can be used to create cubes, dimensions, KPI’s or any other analytical processing of OLAP objects.

The Analysis Services Processing Task can be used to process analysis services objects such as cubes, dimensions and mining models.

The Data Mining Query Task can be used to run prediction queries based on data mining models built in analysis service. A prediction query creates a prediction for new data by using the mining model chosen. The query which is used to determine the result set is a Data Mining Extensions (DMX) statement. The DMX language is an extension of the TSQL language that provides support for working with mining models. Using the Data Mining Task the results can be written to a table. If a table with the same name specified already exists, the task can create a new table using the same name or it can even overwrite the table content.

Example – Using The Data Mining Query Task

In this example, we will be using the Analysis Services Data Mining Query Task to run a predication query to determine  the customers who are interested in buying bikes. Using the Data Mining Query Task, I will be querying the TM Decision Tree Mining model which is build on Targeted Mailing Mining Structure available in the “Adventure Works DW Standard Edition” sample database for SQL Server 2008 Analysis Services.

Configuring Data Mining Query Task

1.          Create a new SQL Server Integration Services Project and rename the default package DataMiningQueryTask.dtsx

2.          Double click the DataMiningQueryTask.dtsx package to open it in Design mode.

3.          Drag and drop the “Data Mining Query Task” from the toolbox to the Control Flow window.

4.          Double click the Data Mining Query Task to open up the Data Mining Query Task Editor and then navigate to the Mining Model Tab as shown in the snippet below.

5.          In the Mining Model tab of Data Mining Query Task Editor click the New button to open up the Configure ADO.Net Connection Manger window as shown in the snippet below.

6.          In the Configure ADO.Net Connection Manager window click the New… button to open up the Connection Manager Window.

7.          In Connection Manager window, you need to select “Microsoft OLE DB Provider for Analysis Services 10.0” as the Provider under “.Net Providers for OleDb” from the drop down list as shown in the snippet below and click OK to save the changes.

8.          Next, you need to mention the Analysis Server Details, Authentication Mode (choose Windows NT Integrated Security) and also select the Initial Catalog as “Adventure Works DW Standard Edition”. This database comes as a sample analysis service database for SQL Server 2008. You can test the connectivity to analysis services instance by clicking Test Connection. If the connection is successful, then click OK to save changes in the connection manager.

9.          Return to the Configure ADO.Net Connection Manager window where you will be able to see the data connection properties. If you are fine with the configuration,  click OK to save the changes and you will return to the Mining Model tab of the Data Mining Query Task where you need to specify the mining structure and mining model to use.

10.      In the Mining Model tab of the Data Mining Query Task Editor Window choose a suitable Mining Structure and   Mining Model. In this example choose the Mining Structure as the Targeted Mailing from the drop down list and then select  “TM Decision Tree” as the Mining Model as shown in the below snippet to complete the Mining Model tab configuration.  

11.      Click on the Query tab and you will be presented with three child tabs –  Build Query, Parameter Mapping and Result Set.

12. In the Build Query child tab, you can either provide the prediction query or click the Build New Query button to build a prediction query using the GUI. Once you click on Build New Query the New Data Mining Query window will open (as show in the below). Choose the Case Table by clicking Select Case Table…. and choose the DimCustomer  table from the Select Table pop up window.

13.      Next, choose the fields one by one against the respective source in the bottom panel of the New Data Mining Query. You can also provide the conditions in the Criteria/Argument as shown in the below snippet. Finally, set the Source as TM Decision Tree and choose Bike Buyer in the Fields column and also provide the criteria/argument value equal to 1 along with other fields. Click OK to save the configuration in the New Data Mining Query window and to generate the DMX query.

Continues…

Leave a comment

Your email address will not be published.