New Features in ADO.NET 2.0

Microsoft’s ADO.NET heralded the introduction of a disconnected mode of data access, enabling data exchange even across process boundaries efficiently. This is in sharp contrast to the earlier data access technologies with only connected data access mode of operation. It should be noted that ADO.NET supports both connected and disconnected mode of data access. The introduction of ADO.NET has come as a boon to the development community with excellent features such as, seamless support for XML and support for connection pooling, to name a few. This article introduces the reader to newly added features to ADO.NET 2.0 and discusses how they can improve the performance, scalability, and maintainability of applications.

Why ADO.NET 2.0?

With a view to mitigate the performance drawbacks of the earlier version of ADO.NET and add more flexibility, Microsoft has introduced more new features in ADO.NET 2.0 — features that have profound differences from ADO.NET 1.1, its earlier counterpart.

The performance of DataSets with high volumes of data has posed performance problems in the earlier version of ADO.NET. With ADO.NET 2.0 however, the indexing engine of the DataTable has been completely re-written to facilitate much faster processing of data. According to MSDN, “The indexing engine for the DataTable has been completely rewritten in ADO.NET 2.0 and scales much better for large datasets. This results in faster basic inserts, updates, and deletes, and therefore faster Fill and Merge operations”. There have been numerous enhancements to the DataSet and the DataTable classes too.

ADO.NET 2.0 — The new features

The salient features of ADO.NET 2.0 are, but are not restricted to:

  • Enhancements to the DataSet and Datatable classes
  • Optimized DataSet Serialization
  • Conversion of a DataReader to a DataSet or a DataTable and vice versa
  • Data Paging
  • Batch Updates — Reduction in database roundtrips
  • Asynchronous Data Access
  • Common Provider Model
  • Bulk Copy

The following sections discuss each of these features in detail.

Enhancements to the DataSet and the DataTable classes

In ADO.NET 2.0, the DataTable supports two methods Load and Save. Note that the DataSet class supports both these methods from the earlier version of ADO.NET. While the Load method can load data from XML into the DataTable instance, the Save method can persist the DataTable instance to a persistence storage media. Further, unlike in ADO.NET 1.1, the DataTable class now supports serialization. It should be noted that in ADO.NET 1.1, only the DataSet class supported serialization. Moreover, the DataTable class in ADO.NET 2.0 contains a method called CreateTableReader that returns a DataTableReader instance that can be used not only to read forward only data but also in a disconnected mode of operation.

Further, the DataTable class in ADO.NET 2.0 is serializable, unlike in ADO.NET 1.1 where one had to store the DataTable instance inside a DataSet instance to make it serializable.

Optimized DataSet Serialization

The DataSet object in ADO.NET is an in-memory representation of disconnected, cached set of data and provides a consistent relational programming model regardless of the data source. When we require sending a DataSet class instance across process boundaries, the DataSet needs to be serialized. But, what is Serialization? Serialization is the process of converting an in-memory object into a serial stream of bytes. Serialization and De-Serialization is mostly used to transport objects or to persist the state of the objects to a persistent storage media (e.g. to a file or database).

Unfortunately, the DataSet class in the earlier version of ADO.NET used to serialize data as XML even if the BinarySerializer was specified. This resulted in slower serialization and an overhead of large sized serialized data. In ADO.NET 2.0 however, DataSet serialization has been improved to a large extent and you can now use the RemotingFormat property of the DataSet class to specify that the data is to be stored directly in binary format. Hence, the DataSet class ADO.NET 2.0, unlike ADO.NET 1.1, supports both XML and Binary Serialization formats. This is shown in the code example below.

BinaryFormatter binaryFormatter = new BinaryFormatter(); FileStream fileStream = new FileStream(“c:\emp.dat”, FileMode.CreateNew); DataSet empDataSet = GetEmployeeDataSet(); //This is a custom method that creates, populates and then returns a DataSet instance. empDataSet.RemotingFormat = SerializationFormat.Binary; //Serialize the employee Data Set instance as binary. In order to serialize the
//same instance as XML, specify SerializationFormat.XML binaryFormatter.Serialize(fileStream,empDataSet); fileStream.Close();

Conversion of a DataReader to DataSet or DataTable and vice-versa

ADO.NET 2.0 allows loading a DataReader object into a DataSet or a DataTable and vice versa. Both the DataSet and the DataTable classes in ADO.NET 2.0 contain the Load method that can be used to load a DataReader instance into a DataSet or a DataTable.

The following piece of code shows how a DataTable can be loaded in a DataReader instance.

string connectionString = ….; //Some connection string SqlConnection sqlConnection = new SqlConnection(connectionString); sqlConnection.Open(); SqlCommand sqlCommand = new SqlCommand(“Select * from Employee”, sqlConnection); SqlDataReader sqlDataReader = sqlCommand.ExecuteReader(CommandBehavior.CloseConnection); DataTable dataTable = new DataTable(“Employee”); dataTable.Load(sqlDataReader);

The GetDataReader method of both the DataSet and the DataTable classes can be used to retrieve a DataReader instance from either a DataSet or a DataTable. If ithe DataSet instance on which the method is called contains multiple DataTable instances, the resultant DataReader would also contain multiple resultsets.


Data Paging

Data Paging is a very powerful feature in ADO.NET. It can be recollected that in the earlier version of ADO.NET we needed to make use of stored procedures for incorporating Data Paging functionality in our applications. Now, with ADO.NET 2.0, it is much simplified with the introduction of the ExecutePageReader method in the SqlDataReader class. The following code snippet illustrates how this feature can be achieved.

string connectionString = ….; //some connection string SqlConnection sqlConnection = new SqlConnection(connectionString); sqlConnection.Open(); SqlCommand sqlCommand = new SqlCommand(“Select * from Employee”, sqlConnection); SqlDataReader sqlDataReader = sqlCommand.ExecutePageReader(CommandBehavior.CloseConnection, 1, 25);

Batch Updates — Reducing Database Roundtrips

Database round trips degrade the performance of applications to a large extent due to increased network traffic. The Batch update of ADO.NET 2.0 promises to improve the performance of applications to a large extent by reducing the number of round trips to the database. In the earlier version of ADO.NET, if we made any changes to the DataSet and then saved the DataSet using the Update method of the DataAdapter class, it made round trips to the database for each modified row in the DataSet. This was a major performance hindrance.

But, how does it know which row or rows have been changed? When we make changes to a DataRow, its RowState changes to reflect the change. Now, for every row that has been changed (depending on whether the RowState property has been changed or not), the DataAdapter communicates with the database in such a type of operation. This is a terrible performance drawback with large volumes of data. In ADO.NET 2.0 however, there is a property known as UpdateBatchSize that can be used to specify a group or batch of rows for a particular hit to the database. It provides the number of rows to be updated in a batch. In other words, the UpdateBatchSize property of the DataAdapter class actually determines the number of changed rows changed rows that send to the database server in a single operation.

Asynchronous Data Access

In the earlier version of ADO.NET, the ExecuteReader, ExecuteScalar and the ExecuteNonQuery methods used to block the current executing thread. However, ADO.NET 2.0 supports asynchronous data access mode.In ADO.NET 2.0, these methods come with Begin and End methods that support asynchronous execution.

The Common Provider Model

In the earlier version of ADO.NET, if we wanted to implement a provider independent Data Access Layer, we had to implement the Factory Design Pattern where a class would have been responsible for returning the specific type of Command,Data Reader, DataAdapter or Connection. In ADO.NET 2.0 we can create provider-independent data access code even without referencing the provider-specific classes using the System.Data.Common namespace that exposes a number of factory classes.

The DbProviderFactory class contains two methods called the GetFactoryClasses method and the Getfactory method. While the former is responsible for retrieving all the providers supported, the later can be used to retrieve the specific provider. Refer to the code snippet below that demonstrates how we can make use of this class to create a sql connection seamlessly.

DbProviderFactory dbProviderFactory = DbProviderFactories.GetFactory(“System.Data.SqlClient”);
DbConnection dbConnection = dbProviderFactory.CreateConnection();

Bulk Copy Feature

In the earlier version of ADO.NET, copying a large volume of data from a source data store to a destination table in SQL database had performance drawbacks due to the repeated database accesses that were required. The SqlBulkCopy feature in ADO.NET 2.0 enables us to copy a large volume of data between a source data store and a destination data table. This class can be used to specify the source and the target data sources for this copy operation. The following code snippet illustrates how this feature can be implemented.

SqlConnection connectionObjSource = new SqlConnection(Conn_str); connectionObjSource.Open(); SqlConnection connectionObjTarget = new SqlConnection(Conn_str1); connectionObjTarget.Open(); SqlCommand sqlCommand = new SqlCommand(“Select * from Employee”, connectionObjSource); SqlDataReader sqlDataReader = sqlCommand.ExecuteReader(CommandBehavior.CloseConnection); DataTable dt = new DataTable(“Employee”); SqlBulkCopy sqlBulkcopy = new SqlBulkCopy(connectionObjTarget); bulkcopy.DestinationTableName = “Employees”; bulkcopy.WriteToServer(sqlDataReader);

References

Please refer to the following links for further reference on this topic:

Conclusion

This article has provided a glimpse at some of the newly added features to ADO.NET 2.0, the new version of ADO.NET technology that ships with Microsoft’s Visual Studio 2005. In future articles, I will discuss how we can make use of each of these powerful features and built a robust, secure, scalable and a high performance application using ADO.NET. I hope that readers will find this article useful and would welcome their comments. Happy reading!

]]>

Leave a comment

Your email address will not be published.