Shared Datasets in SQL Server 2008 R2
In the spatial data series we looked at free-form text based address data and worked towards developing a Geospatial Dashboard . In this artilce we will look at using one of the new enhancements introduced in SQL Sever 2008 R2 – Shared Datasets. Before we discussing the advanatages and applications of Shared Datasets, it is helpful to understand a practial scenario where it would really come in useful.
Why Use Shared Datasets ?
Our BI-Satellite application performed the below tasks:
1) Used data from a SQL Server table, read this data in SQL Server Integration Services (SSIS) and geocoded it using Bing Maps web services.
2) Stored the data in SQL Server Spatial datatypes.
3) We used the dataset in SQL Server Reporting Services (SSRS) and developed a report using the Map control.
4) Finally, made the data in our SQL Server table more detailed and developed a scaled down version of a Geospatial Dashboard.
To keep the focus on the technical study of the geospatial data, we kept the data relatively simple. Consider this data as the regional sales data from a company. When someone would like to view the report of sales of a salesperson , or if the senior management of the company would like to check the details of the performance of all the salesmen of the company it would be mandatory that either the geocode details of the salespeople’s locations are stored in the database whenever a new record is entered or they are updated each time a location value changes.
Adding to this complexity, consider a scenario that departmental stores in some regions have shifted from one place to another. So effectively their geocode values would also change. In these cases, one would be required to maintain a batch process where each change in address would continuously require updating.
But step 2 and step 3 can be eliminated by a solution integrating SSIS and SSRS. We can use a SSIS data processing extension from SSRS, and geocode values on a “on demand” basis. SSIS packages as explained in Part I would geocode values, and this package can be used as a data source in SSRS using SSIS data processing extension. Explanation of implementation details of the same is out of scope of this article, so we would take it as granted that we have implemented this solution.
Now when we use such data source and create a dataset, it would be easier to implement for theIT staff but the same cannot be expected from business users who would need the same data for use in their reports using Reports Builder. As a solution we can opt to make the data source Shared. But when there is a complex process involved in extracting a dataset, one would also want to share the same to make it reusable and easy to use for business users. In our case the dataset cannot be created using a SQL Query as our data source type is the SSIS Data Processing Extension. So we need to create the dataset in a way that can be shared by multiple reports and business users without requiring them to get into the details of how to extract this data.
How to use Shared Datasets ?
The question of “Why” we would want to use shared datasets is now hopefully clear, but the question of “How” we would used shared datasets is yet to be answered and “What” are the advantages of the same is yet to be seen. And we would look at the same by implementing it with our BI App. As discussion of SSIS Data Processing Extension is out of the scope, we would use the same “Address” table in the normal way. Follow the steps below to implement the solution:
1) The first point that should be made is that shared datasets can only use a shared data source.
2) Right-click on the shared data sources folder in the solution explorer pane, and select “Add New Data Source”. Create a connection to the database which has your “Address” table and name it something relevant. After you have created your shared data source, you would find a .rds file listed in this folder.
3) Now add a new shared data set by right-clicking the Shared Datasets directory and selecting “Add New Dataset”. You should be able to see the dialog as shown in the below screenshot. Select data from Address table and name it “FamousBuildingsSharedDS”. After that a .rsd file would get created, which would be listed in this folder.
4) Deploy this data source and dataset.
5) Open Reports Builder and create a new blank report. On the reports data pane, right-click Datasets directory and select “Add Dataset”. Select “Use a shared dataset” option and click on the browse button. Navigate to the directory where your shared dataset would have got deployed (normally with default deployment settings in native installation mode, you would find it in datasets directory). You should be able to find the dataset we just deployed in step 2 as shown in the screenshot below. Use this dataset and name it “MyDS”.
6) Now just to check that this data is accessible, add a table to the report. Select a few fields from this dataset and your report should look like the below screenshot.
Advantages of using Shared Datasets
1) The first and biggest advantage in my view is reusability. Shared datasets can be used by multiple reports, and embedded datasets are limited to the scope of report in which it is defined. Also as we discussed in this article, if you have a specific complex or a generalized dataset, and the same dataset or it’s subset is used by multiple reports, it would be more efficient to implement a generic version of the dataset by using shared dataset feature.
2) As multiple reports are using the same dataset definition, this makes the definition more centralized and easier to maintain. Changing the definition of the dataset at a single location would cascade changes to all the reports using this dataset. This is an advantage, but it should be used with caution.
3) By separating the dataset element from the definition of the report, it becomes easy to individually configure datasets on a broad level. When datasets are consumed in reports, they can be tuned as per the needs of report like selection of fields can be modified, more filter parameters can be added, datasets can be cached separately from the report etc.. But the reverse is not possible, which means that individual reports consuming a shared dataset cannot override the definition of the shared dataset. In a way, this can be seen as a contract implementation by reports using this dataset. Once you have a report using this shared dataset, the scope defined by the shared dataset is guranteed to be available to consumers of this dataset. This is more desireable from an architectural standpoint.
Shared Datasets is one of the new features that is available in SQL Server 2008 R2 version. Use this feature to increase reusability of your data logic and practice a more robust architecture design.