Peter Gulutzan and Trudy Pelzer are the co-authors of SQL Performance Tuning, a new book on how to performance tune SQL code. In this interview, the authors tell us a little about their book and offer some advice for DBAs and SQL developers.
Please tell us about yourselves.
PETER GULUTZAN: I worked for a few years for a computer manufacturer, then was a systems analyst for a trust company, then started a company to develop software packages for libraries. At that time, PCs were just becoming available, so there wasn’t a lot of software around. Before I could write my first useful library application, I needed a good DBMS. Since there weren’t any, I wrote my own.
TRUDY PELZER: I joined that company (Ocelot) while it was still in the library business. My specialty was software testing, so I became responsible for ensuring the new DBMS worked properly. Early on, we decided to adopt SQL as the basis for our product, so I set about familiarizing myself with the ANSI SQL Standard; it was still just in draft form then. Along the way, we both realized we could do some writing and we co-wrote three other books before we did SQL Performance Tuning. I’ve also written some reviews for ercb.com and Dr.Dobb’s, and Peter has several recent articles on dbazine.com.
Why did you decide to write this book?
TRUDY: It’s the only one we could write. Our specialty is SQL itself. In the field of DBMSs, we’re generalists — so writing about insider secrets is out. On the other hand, the market for introductory SQL books is past the saturation point. We pitched this idea to Addison-Wesley with the guarantee that it was unique.
PETER: We’ve worked on manufacturing DBMSs (not one of the ones in the book), so we know a lot about internals.
SQL Performance Tuning covers a variety of different databases, not just SQL Server. Now here’s a tough question, of the databases discussed in your book, does any one of them lend itself, over the others, as being easier to tune via SQL than the others.
TRUDY: Three vendors (Informix, Microsoft, and Oracle) have End User License Agreements which prohibit us from publishing specific comparative benchmark results about them. For example, the Informix EULA says: “… user may not disclose benchmark or other performance test results without IBM’s prior written consent.”
Looking at the others … MySQL seemed most affectable, Sybase seemed the least. I’ll emphasize that’s based on vague recollections of something we weren’t trying to measure, so you might want to take this as a talking point or a bit of tavern gossip.
Some cars lack automatic transmissions, and they are more “tunable” for efficiency. Some would say that’s bad because there’s more hassle and more to learn, for small gains. I expect that they’re right, really. But we aren’t in the database era where everything has an automatic shift.
Which areas of database performance tuning do you feel has the most significant affect on overall database performance?
PETER: In the book we appealed to authority. We quoted both Kevin Kline (co- author of Transact-SQL Programming) and Thomas Kyte (author of Expert One on One: Oracle). Both those sources said the same thing: 80% of the gain comes from tuning the SQL.
First of all, let’s distinguish as always between “response time” and “throughput.” “Response time” is the time elapsed between when the user hits “send” and when the screen fills with data. Obviously we can improve “response time” by getting a faster local computer, increasing the network bandwidth, and rewriting the client application in assembly language. But all those improvements have minimal effect on the DBMS engine’s performance, and that’s the most likely bottleneck because often there’s only one DBMS engine for many clients and many requests. Therefore we measure “throughput” instead, which is the number of queries that can be processed in a time period.
Now, it’s true that doubling the speed of the server hardware will double the throughput, if there are no locks, and there are always requests waiting for service, and task switching is free, etc. But we avoid hardware talk in the book. It’s reasonable to assume that a shop will either have state-of-the-art equipment (in which case hardware cannot be improved), or else it has decided to be cheap (in which case hardware cannot be improved). So, we started our book with the assumption that application programmers will have to learn how to use what they’ve got.
Among database configuration settings, the ones like “how much memory to allocate for a sort” or “how often a flush should occur” are clearly vital. They are also the ones that vary most wildly from one DBMS to another, or even from one version of a DBMS to another. So a book like SQL Performance Tuning won’t replace a book like Inside Microsoft SQL Server 2000, which is specific and specialist.