SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Training
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Performance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
SQL Azure
Developer
General DBA
ASP.NET / ADO.NET
SQL Azure

USEFUL SITES :

ASP.NET Tutorials
Windows and SQL Azure Tutorials
Cloud Hosting Magazine
SharePoint Tutorials
Windows Server Help

Write for Us

Share your SQL Server knowledge with others and raise your profile in the community More...
Latest Articles

A High Level Comparison Between Oracle and SQL Server - Part ...
A High Level Comparison Between Oracle and SQL Server - Part ...
A High Level Comparison Between Oracle and SQL Server - Part ...
A High Level Comparison Between Oracle and SQL Server

More     
 
Latest FAQ's

Add Node to A SQL Server failover Cluster failed with invalid ...
SQL Server Destination remote server error
Setting Up Data And Log Files For SQL Server
Will Check Constraints Improve Database Performance?

More     
   
Latest Software Reviews

dbForge Review
Spotlight on ApexSQL Diff - Server-based database comparison tool ...
Spotlight on ApexSQL Data Diff - Server-based database comparison tool ...
Spotlight on ApexSQL Doc 2008

More     

articles >> business intelligence >> SSIS New Features in SQL Server 2008 ...

SSIS New Features in SQL Server 2008 - Part 2

By : Arshad Ali
Feb 11, 2009

This article is part 2 of a 5 part series that explores the SSIS features introduced in SQL Server 2008.  Focussing on Lookup Transformations in SSIS 2008.

Lookup transformations are one of the most widely used transformations in the SSIS; Microsoft have recognised this and thus have improved it better performance. For example:

  • In SQL Server 2005, the Lookup component could only get data from specific OLEDB connections, the cache could be populated only by using a SQL query (now it can be from a cache file). In SQL Server 2008, source data can be used from almost anywhere, such as from text files, raw files, XML files, Excel files, proprietary-format binary files, web services, a record-set, or anything that can be accessed using an ADO.Net provider.
  • SQL Server 2005 reloads the cache every time it is used. In SQL Server 2008, a cache can be used in multiple lookups thus solving the problem where each iteration in a loop needs to repopulate the cache. The cache can be saved to virtual memory or permanent file storage. Therefore within the same package, multiple Lookup components can share the same cache. The cache can also be saved to a file and shared with other packages. If more than one package needs to access the same reference data, it can be loaded in a persistent cache and shared by all the packages, rather than be loaded and unloaded for each package. The cache file format is optimized for speed and access to it can be orders of magnitude faster than reloading the reference dataset from the original relational source. Also SQL Statements can be built  dynamically within an OLE DB Command that can now change at execution-time where the data is stored in the cache. For this purpose, use Cache Transform transformation which writes data from a connected data source in the data flow to a Cache connection manager (only unique rows). Configure the Cache connection manager to save the data to a cache file (.caw) which can be referenced / shared by lookup transformations of other data flow tasks or by other packages as well.
  • In SQL Server 2005, the lookup transformation has only two outputs, one regular output through which matching records pass and another one is error output through which all non-matching records and records with error pass. SQL Server 2008 made a distinction between these two. It now has three outputs, one regular output for “Matching” records; second one for “No Matching” records and third one for records with error.

Note:
If the Lookup transformation is configured to treat the rows without matching entries as errors, the rows are redirected to the error output instead of No Matching output.

  • Aside from the above, other improvements are: miss-cache feature, which saves time by optionally loading into cache the key values that have no matching entries in the reference dataset and can contribute up to a 40% performance improvement in some scenarios, a more intuitive user interface that simplifies the configuration of the Lookup component, in particular the caching options, Optimized I/O routines leading to faster cache loading and lookup operations etc.

Note:

  • If the reference database is remote, or under heavy load, consider using the Cache Connection Manager instead of an OLEDB connection.
  • Once a cache is used (or created) in an SSIS package, it will be kept in memory until the package has finished executing. The cache can be reused across multiple data flows, and shared between multiple lookups in the same data flow. It can also be persisted to disk, and reused across package executions.
  • When the cache is saved to a file, the system loads the cache faster. This improves the performance of the Lookup transformation and the package. Remember, when using a cache file, the data being worked on is not as current as the data in the database.

In summary, with SQL Server 2008 the Lookup transformation has been improved to allow explicit control over the lookup data, a new breed of connection manager viz. Cache Connection Manager has been introduced to store cache to file and share cached lookup data among different components and packages and finally the more intuitive UI for designing Lookup transformation.

Example
The below example uses the same package as discussed in the previous section, however this time using SSIS 2008 with all three outputs of the lookup transformation.

 

Right click on the Lookup transformation and click on Edit, a new more intuitive lookup transformation editor will come up as shown below, it has now five tabs than three tabs in previous version. On General, specify the caching mode, the connection type to use for reference table and how to handle non matching records. On Advance tab, can specify the size of the cache for 32-bit and 64-bit machines (Default value is 25 MB).



So the final result of running the package, 542 rows are flowing from source to the lookup as input, reference table matches with 124 rows which passes through “Match Output”, another 418 rows which could not be matched are passed through “No Match Output” path. Since there are no erroneous records, there are no records passing through “Error Output”.
 


        








C# Help and Tutorials | PHP MySQL Tutorial | Sharepoint Tutorial | Azure Tutorial | Cloud Hosting Magazine | ASP.NET Tutorials | Windows Server Help | Windows Phone Pro | Silverlight Ace | Visual Studio Tutorials | Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | SQL Server Training Videos | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views


              © 2010 Jude O'Kelly. All rights reserved