When using ADO.NET, what are some tips that they can follow in order to get the best performance out of their application and SQL Server?
It's difficult to cover every possible answer here, but I can mention a few important ones.
Use the SQL Server .NET Data Provider to connect to SQL Server. This simple piece of advice provides a tremendous performance boost. If you need to connect to other data sources, you might be tempted to use a more general provider, such as the OLE DB .NET Data Provider, but what you gain in flexibility you lose in performance.
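As a minimal sketch (the server name and database are placeholders), connecting through the SQL Server .NET Data Provider looks like this:

```csharp
using System;
using System.Data.SqlClient;

class ProviderExample
{
    static void Main()
    {
        // SqlClient speaks TDS directly to SQL Server -- no OLE DB layer
        // in between. Replace the placeholder server/database as needed.
        string connString = "Server=(local);Database=Northwind;Integrated Security=SSPI;";

        using (SqlConnection conn = new SqlConnection(connString))
        {
            conn.Open();
            Console.WriteLine("Connected via the SQL Server .NET Data Provider.");
        } // Dispose() closes the connection (and returns it to the pool).
    }
}
```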
Try to use the ExecuteNonQuery method of the SqlCommand object as much as possible, as this is the most efficient way to execute commands from ADO.NET. If you need to retrieve just a few values, or a single data row, use output parameters on a SqlCommand instead of more expensive techniques, such as a SqlDataAdapter, a SqlDataReader, or a strongly typed DataSet.
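A sketch of the output-parameter technique (the stored procedure GetUnitPrice and its parameter names are hypothetical):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class OutputParamExample
{
    static void Main()
    {
        // Hypothetical stored procedure that returns a single value
        // through an OUTPUT parameter instead of a result set.
        using (SqlConnection conn = new SqlConnection(
            "Server=(local);Database=Northwind;Integrated Security=SSPI;"))
        using (SqlCommand cmd = new SqlCommand("GetUnitPrice", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add("@ProductID", SqlDbType.Int).Value = 42;

            SqlParameter price = cmd.Parameters.Add("@UnitPrice", SqlDbType.Money);
            price.Direction = ParameterDirection.Output;

            conn.Open();
            cmd.ExecuteNonQuery();          // no result set travels over the wire
            Console.WriteLine(price.Value); // the value comes back in the parameter
        }
    }
}
```

ExecuteNonQuery skips the machinery of building and streaming a result set, which is why it is the cheapest way to get one or two values back.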
Think of a DataSet as a MINI-database: a TINY collection of related data that serves a particular purpose. If a command that fills a DataSet doesn't have a WHERE clause, think twice about it. Data has to traverse the network from the database server to the client, and bandwidth is always limited. The more data you retrieve, the more overhead for your database server, and the more concurrency problems you might experience.
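For example, a filtered fill keeps the DataSet tiny (table and column names below are from the Northwind sample database; the connection string is a placeholder):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class FilteredFill
{
    static void Main()
    {
        string connString = "Server=(local);Database=Northwind;Integrated Security=SSPI;";

        // The WHERE clause means only one customer's orders cross the
        // network, not the whole Orders table.
        SqlDataAdapter adapter = new SqlDataAdapter(
            "SELECT OrderID, OrderDate FROM Orders WHERE CustomerID = @CustomerID",
            connString);
        adapter.SelectCommand.Parameters.Add(
            "@CustomerID", SqlDbType.NChar, 5).Value = "ALFKI";

        DataSet orders = new DataSet();
        adapter.Fill(orders, "Orders"); // Fill opens and closes the connection itself
        Console.WriteLine(orders.Tables["Orders"].Rows.Count);
    }
}
```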
“Acquire late, release early,” or: “Connect as late as possible, read as fast as possible, and disconnect as soon as possible.”
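In code, that motto maps naturally onto `using` blocks (the query text is a placeholder):

```csharp
using System;
using System.Data.SqlClient;

class AcquireLateReleaseEarly
{
    static void Main()
    {
        string connString = "Server=(local);Database=Northwind;Integrated Security=SSPI;";

        using (SqlConnection conn = new SqlConnection(connString))
        using (SqlCommand cmd = new SqlCommand(
            "SELECT CompanyName FROM Customers", conn))
        {
            conn.Open(); // connect as late as possible
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())          // read as fast as possible,
                    Console.WriteLine(reader.GetString(0)); // no work between rows
            }
        } // disconnect as soon as possible: Dispose() releases the connection
    }
}
```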
Can you tell us a little bit about how the new native access to SQL Server, via TDS, benefits performance? Why wasn’t this option available in previous versions of ADO?
Direct TDS access was available in previous data access technologies, in both ODBC and OLE DB. The OLE DB Provider for SQL Server and the ODBC SQL Server Driver were both libraries designed to communicate natively with SQL Server via TDS, because TDS is the protocol SQL Server natively uses to communicate with the outside world. The problem was that you needed to program the OLE DB Provider for SQL Server (SQLOLEDB) or the ODBC driver directly, and they were designed by C++ programmers for C++ programmers. Using RDO or ADO was easier, but it added an extra layer, which meant more overhead and lower performance. Still, many programmers preferred the easier interface (ADO) over the performance of native SQLOLEDB.
Also, the new provider runs in managed code, so there is better code control, fewer memory leaks, and ultimately better error checking.
Another advantage of the new provider is that the data access developer team is, at last, part of the same group as the SQL Server developer team, so both teams have a better understanding of how to optimize access to SQL Server from the client side.
For anyone curious about how differently the new provider works, I'd recommend using Profiler to trace what actually arrives at SQL Server. Try a few common operations with both ADODB and the SQL Server .NET Data Provider, and you will see a more natural translation of commands into Transact-SQL, with few surprises for a DBA.
How does connection pooling work under ADO.NET? What are its pros and cons?
Connection pooling works in ADO.NET in much the same way as it did in OLE DB. Using it dramatically cuts the overhead required to create and destroy connections in SQL Server. Each distinct connection string (provided the security context doesn't change) creates a new pool of connections, and when the client disconnects, the .NET application keeps that connection open to SQL Server in the pool for further use.
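Pooling is on by default in the SQL Server .NET Data Provider, and a few connection-string keywords tune it; a sketch (server and values are illustrative):

```csharp
// Pooling is enabled by default; these keywords adjust its behavior.
string connString =
    "Server=(local);Database=Northwind;Integrated Security=SSPI;" +
    "Min Pool Size=5;Max Pool Size=100;Connection Lifetime=300;";

// Connection strings that differ even in whitespace map to DIFFERENT
// pools, so build the string in exactly one place in your application.
```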
This technique doesn't cause many problems, but programmers need to be extra careful with termination code. Any temporary object or outstanding transaction remains alive for the next user of that particular connection, potentially producing unexpected results. So make sure you drop any temporary objects you create, and properly close all transactions before disconnecting from ADO.NET, because closing a connection on the client doesn't necessarily mean that SQL Server closes it.
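A sketch of that cleanup discipline with a temporary table (connection string is a placeholder):

```csharp
using System.Data.SqlClient;

class PooledCleanup
{
    static void Main()
    {
        string connString = "Server=(local);Database=Northwind;Integrated Security=SSPI;";

        using (SqlConnection conn = new SqlConnection(connString))
        {
            conn.Open();

            SqlCommand cmd = new SqlCommand("CREATE TABLE #Work (ID int)", conn);
            cmd.ExecuteNonQuery();
            // ... use #Work ...

            // Drop the temp table explicitly before closing: Close()
            // returns the connection to the pool rather than ending the
            // SQL Server session, so leftover state could otherwise be
            // seen by the next caller that draws this connection.
            cmd.CommandText = "DROP TABLE #Work";
            cmd.ExecuteNonQuery();
        } // the connection goes back to the pool, clean
    }
}
```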
Connection pooling provides better scalability when your application must establish many different connections to SQL Server, but not many of them simultaneously; web applications are the typical example. In this scenario, connection pooling poses some extra security challenges. With SQL Server authentication or Windows authentication, each combination of connection string and security context gets its own pool, which can render connection pooling almost useless. The whole point of pooling is to maximize the chance that, when you request a new connection, you can retrieve one directly from the pool, without asking SQL Server to create it for you.
A solution to this problem is some kind of application-defined authentication that funnels access to SQL Server through a few predefined roles/logins, limiting the number of connection pools and maximizing the reuse of open connections.
One thing to consider is that using connection pooling is incompatible with using SQL Server application roles.
If you intend to use connection pooling in components that run in the middle tier, it is better to leverage COM+ object pooling, which provides a similar effect to connection pooling from the database access point of view.
A final comment about connection pooling, to clear up a common misconception: connection pooling has nothing to do with SQL Server itself. In other words, SQL Server doesn't know anything about the existence of any pool of connections. SQL Server sees connections, pooled or not, full stop.