An Exclusive Interview with Author Fernando Guerrero On How to Get the Most of ADO.NET and SQL Server
In this exclusive interview, Fernando shares some of his knowledge and experience on how to get the most out of ADO.NET and SQL Server 2000.
Developers who write applications that use SQL Server as their back-end database are always interested in getting the best performance out of SQL Server. What does ADO.NET offer new to developers who want to get the best performance out of SQL Server?
The SQL Server .NET Data Provider uses TDS (Tabular Data Stream) to connect directly to SQL Server. This is the native packet format that SQL Server uses, and it doesn’t require connecting through other layers, such as ODBC or OLEDB. Programmers used to believe that native access to ODBC from C++ was the fastest way to connect to SQL Server, now there is a new kid in town: the SQL Server .NET Data Provider.
I would recommend learning as many tricks as possible about the SQLCommand class and how to use stored procedures and user-defined functions through SQLCommand objects. Mastering SQLParameter utilization and using the SQLDataReader class will make your application rock.
When using ADO.NET, what will developers have to “unlearn” and “learn new” when making the transition from ADO to ADO.NET?
I believe that the most difficult concept to learn is the difference between the connected and disconnected layers. From an ADODB point of view, the Recordset object was something connected to a database at some point, whereas the DataSet doesn’t know anything about where the data comes from. It is so different that a Dataset doesn’t have any property or method to connect to a database. For developers used to working with the disconnected model, it won’t be a problem to adapt themselves to the new rules. For developers who rely on server side cursors, and pessimistic locking, there will be a more difficult transition path to follow.
ADO.NET provides the possibility of de-coupling two programmers’ roles. The client layer deals with data with little or no knowledge of back-end implementation. Middle layer programmers will perform connected operations on behalf of the client applications, such as filling a DataSet or updating the backend database according to changes applied to a DataSet.
Something to unlearn is the concept of “browsing,” as in ADO.NET there is no such concepts of MoveNext, MovePrevious, and so on. These operations have been part of the programmer’s life since dBase times, and now we need to learn how to deal with “collections” of rows instead.
As ADO.NET greatly integrates with XML, and with the arrival of SQLXML 3.0, database developers should learn about XML as much as they can. It is not necessary to be an expert on XML to be a good database developer, but it won’t hurt you, and you’ll greatly benefit from this knowledge.
What do Database Administrators (DBAs) need to know about ADO.NET in order for them to get the most of ADO.NET?
DBAs should work very closely with middle layer database programmers to understand the way they need to retrieve and update data in SQL Server, so they can build the most efficient indexing and locking strategies, user-defined functions, and stored procedures for them. Profiler is always a good friend for a conscious DBA, and in this case it will help a lot understanding the way ADO.NET interacts with SQL Server.
In some cases it would be interesting to consider the creation of indexed views as potential sources of performance boost, and the Index Tuning Wizard can help a lot on this topic.
I really believe that every DBA should learn as much as possible about ADO.NET, at least the connected layer, to stay alive in this dynamic market. And it is even more important to do so, having in mind what the next SQL Server release will bring in this line.
What are the biggest mistakes most developers make that negatively affect performance of their SQL Server-based applications? What can they do to avoid these mistakes?
In my opinion, the biggest mistake that most developers make when interacting with SQL Server is the “SELECT * FROM myTable” syndrome. If there’s no WHERE clause, something doesn’t smell right. And If there’s an “*” you are retrieving too much. Developers should only retrieve the columns they need from the rows they need to use at any given time. Another great problem is the misunderstanding of JOIN strategies. I’m always amazed by the overuse of LEFT OUTER JOIN. Whenever I see one of them, I ask the developer: “Are you really sure this is what you want to retrieve?” And what about badly-formed queries resulting in unexpected CROSS JOIN queries? A good understanding of SQL language would help them tremendously.
Another common problem is the use of costly server side cursors, avoiding efficient set operations; often due to lack of knowledge about the way relational databases work. Whenever I see a stored procedure with a Transact-SQL cursor, I try to remove it by using more efficient plain-vanilla set operations. In the last 9 years working with SQL Server, I found just a handful of cases where using cursors were perhaps the only solution to those particular problems.
Concurrency problems are another common source of “performance” problems. You’d be surprised about how many times a connection is waiting only for locked resources to become available. The user perceives this as a performance problem, when actually the server’s CPU is almost idle, just waiting for blocking locks to release their precious resources. Applying the right isolation level is key in this topic.
Security is another common missing point among developers. “If it doesn’t run as sa, it doesn’t run at all.” Have you ever heard this comment? How many database applications connect to SQL Server as sa and no password? Hackers know this, and they are extremely happy with this common practice.
However, perhaps the biggest mistake overall is lack of understanding about the data, its structure, and the way this particular database system manages data. I did mention earlier that a DBA should understand how ADO.NET works, and for the same reasons, a client side or middle layer programmer should have as much database knowledge as possible.