Working with Spatial Data Part I – Geocoding Text-Based Spatial Data for Use In SSIS Packages

Address information is one of the most common attributes of data stored in a business data repository. Examples include employee addresses, store locations of retail chains, and product sales. You may wonder how sales of a product can be linked with an address. Most retail chains give customers a loyalty card on which they collect points for each product purchased. The card is tied to the customer’s address, so geographic data on sales is collected. This data facilitates data mining with techniques such as market-basket analysis by geography, and in spatial form the same data can be used to analyze consumption of a particular product in an area.

The focus of this article is geocoding location data. In its simplest definition, geocoding is the process of deriving the position of a location in a standard geographical notation. The best-known notation is latitude and longitude, by which any position on the earth’s surface can be defined.
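
To make the latitude and longitude notation concrete, here is a minimal Python sketch of a geocoded position in decimal degrees. It is an illustration only, not part of the SSIS package built in this article; the GeoPoint type is a hypothetical helper, and the sample coordinates are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoPoint:
    """A position in decimal degrees (hypothetical helper type)."""

    latitude: float   # degrees north of the equator, from -90 to 90
    longitude: float  # degrees east of the prime meridian, from -180 to 180

    def __post_init__(self) -> None:
        # Reject values that cannot be a position on the earth's surface.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


# Approximate coordinates of the Eiffel Tower, for illustration only.
eiffel_tower = GeoPoint(latitude=48.8584, longitude=2.2945)
print(eiffel_tower)
```

Negative latitudes lie south of the equator and negative longitudes lie west of the prime meridian, so a single pair of signed numbers is enough to describe any location.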

Spatial (geographical) data analysis for any purpose includes the following steps:

1) Geocoding text-based location information.
2) Generating a repository of the geocoded information for applications to consume.
3) Representing this information in an analyzable form, such as on a geographical map.
4) Customizing the presentation layer to retrieve data dynamically and format it in a presentable manner.

For this demonstration we will focus on Step 1 – Geocoding location information. 

Environment Setup:

To create a demo project for this concept, we need to make three artifacts available to the project:

1) Test Data: We need to create some test data that can be used for geocoding. As we want to apply this concept to data found in real-world database systems, we need to create an “Address” table as shown in the screenshot below. I have inserted some rows describing famous buildings across the globe. I have intentionally left some values NULL in either the “AddressLine2” field or the “City” field, to simulate the data in legacy or on-line transaction processing (OLTP) systems. Many systems allow data to be entered in a free-text format, and we may not always find data precise to the last digit of a zip code. For now, to keep our focus on the concept, we will keep this data simple and easy to geocode.
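
Because some fields are NULL, the address parts have to be combined defensively before they are passed to a geocoding service. The following is a minimal Python sketch of that idea, not the SSIS implementation used in this article. Apart from AddressLine2 and City, the column names (AddressLine1, State, Country) and the sample row are assumptions made for illustration.

```python
from typing import Mapping, Optional

# Column order used to assemble the query string. Only AddressLine2 and City
# are named in the article; the remaining column names are assumed.
ADDRESS_COLUMNS = ("AddressLine1", "AddressLine2", "City", "State", "Country")


def build_address_query(row: Mapping[str, Optional[str]]) -> str:
    """Join the non-empty address parts of a row into one comma-separated string."""
    parts = []
    for column in ADDRESS_COLUMNS:
        value = row.get(column)
        # Skip NULLs (None) and values that contain only whitespace.
        if value is not None and value.strip():
            parts.append(value.strip())
    return ", ".join(parts)


# Hypothetical sample row with AddressLine2 and State left NULL.
sample_row = {
    "AddressLine1": "Eiffel Tower",
    "AddressLine2": None,
    "City": "Paris",
    "State": None,
    "Country": "France",
}

print(build_address_query(sample_row))  # Eiffel Tower, Paris, France
```

Skipping empty parts avoids queries such as "Eiffel Tower, , Paris", which a geocoding service may match less reliably.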



2) Geocode processing engine: As we cannot create a database of all the places across the globe ourselves, we need a service such as Google Maps or Bing Maps that can take text-based location information as input and return the latitude and longitude of the corresponding location. These web-based applications expose web services that can be consumed by other applications. Our purpose in this case is to geocode the text-based data in our “Address” table.

For the purposes of this sample application, we will use the web service exposed by Bing Maps. First we need to subscribe to the service. To do so, navigate to www.bingmapsportal.com, sign up, and create a key, which will be your gateway to the web services exposed by Bing Maps. In the screenshot below, you can see that I have created my application-key pair.
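
The role the geocoding service plays, text in and coordinates out, can be sketched in Python as follows. This is an illustration of the contract only and does not call Bing Maps: the fake_geocode function and its lookup table are hypothetical stand-ins for the web service, and the coordinates are approximate.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude) in decimal degrees
Geocoder = Callable[[str], Optional[Coordinates]]

# Hypothetical stand-in for the remote service: a tiny in-memory lookup table.
_FAKE_INDEX: Dict[str, Coordinates] = {
    "eiffel tower, paris, france": (48.8584, 2.2945),
}


def fake_geocode(query: str) -> Optional[Coordinates]:
    """Return coordinates for a known query, or None when there is no match."""
    return _FAKE_INDEX.get(query.strip().lower())


def geocode_all(
    queries: Iterable[str], geocode: Geocoder
) -> Dict[str, Optional[Coordinates]]:
    """Geocode each query and keep unmatched ones as None so they can be reviewed."""
    return {query: geocode(query) for query in queries}


results = geocode_all(["Eiffel Tower, Paris, France", "Unknown Place"], fake_geocode)
for query, coordinates in results.items():
    print(query, "->", coordinates)
```

Passing the geocoder in as a function keeps the calling code independent of the provider, so the stand-in can be swapped for a real service client without changing geocode_all.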



3) Geocode web service wrapper class: We need to create a client or wrapper class that we can use in our SSIS package to conveniently consume the Bing Maps web services. The .NET Framework SDK includes the WSDL.exe tool, which creates a proxy class from a web service’s WSDL (Web Services Description Language) document. With the .NET Framework installed, run this tool with the following command from the Visual Studio Command Prompt:

wsdl /out:C:/geocode.cs http://dev.virtualearth.net/webservices/v1/geocodeservice/geocodeservice.svc?wsdl

When this command is executed, the tool creates a geocode.cs C# class file, which is the proxy class we will use in our SSIS package. The URL in the command is used to retrieve the WSDL of the Bing Maps geocode web service.
 