Achieve Spatial Data Support in SSIS

SQL Server 2008 introduced a new category of datatypes known as spatial datatypes which store spatial information. The new spatial datatypes are geography and geometry. SQL Server Management Studio comes with good good support for these spatial data types like the spatial results tab and facilitated access to .NET Common Language Runtime (CLR) functions on these datatypes. SQL Server Integration Services (SSIS) is one of the business intelligence applications which is most  sensitive to datatypes. Even in the R2 version of SQL Server 2008, it can be safely said that there is no real out-of-box support for these spatial data types. The only possible datatype the maps in SSIS to any spatial datatype is DT_IMAGE, which means that any spatial data is treated like a binary large object (blob) data. There may  be cases when spatial data from two different database servers are brought in and one may wish to perform some spatial functions on the these. However ,  if the data is  DT_IMAGE, it is almost impossible to dissect the different pieces of information from this data. Solution :
There is no built-in support for spatial data types in SSIS, which means that there is no datatype which exactly matches the spatial datatype and when any spatial field is read using a source adapter, the datatype that will be mapped to any spatial field is  DT_IMAGE. The solution to this issue is re-building the datatype in SSIS. Script transformation is the only component for rebuilding any spatial datatype, and  we will also examine the reason for this. For this demonstration, we will consider a dataset which has any spatial datatype field, and to keep our focus on the concept, we will  consider the geography datatype. We will read this datatype from a SQL Server table and then extract  information from this datatype. This exercise will demonstrate how to use spatial datatypes in SSIS to the same level as in SQL Server using SSMS. Environmental setup:
For this exercise, will use a table consisting of at least one field of the geography datatype having at least one record. To save time and effort, I will reuse the Address table that we built in the BI-Satellite project, as below: Key Factors in the Design Decision:
Before we start with the package design, there are several points to keep in mind : 1) Every geography datatype has a spatial reference identifier (Srid) associated with it, which will be stored along with the value stored in the field. 2) To rebuild a .NET CLR datatype (i.e. geography) we need to use a transform that is capable of facilitating access to .NET assemblies, which is the Script transform. 3) Almost all of the spatial static or instance .NET CLR functions capable of creating a spatial object take two input parameters: i) Srid and ii) geometrical structure in text, xml, binary or a byte stream format. 4) Microsoft.SQLServer.Types namespace contains classes related to spatial datatypes, and for the geography datatype the class that we require is SqlGeography. 5) Geography Markup Language (GML) is one of the XML based data format standard defined by Open Geospatial Consortium (OGC), and fortunately the geography datatype has built-in functions to export GML data. 6) We need to make sure that when we read data into the package, this should be read in some form of text and not as a DT_IMAGE datatype. If the data is read as DT_IMAGE, then it will be  impossible to cast it to a datatype which can be fed to a .NET CLR function to  create a spatial datatype.

Continues…

Leave a comment

Your email address will not be published.