If you follow the various communities dedicated to Microsoft SQL Server, you cannot help noticing the many questions about whether it is somehow possible to view or audit the content of the log file. Such questions are good candidates for a FAQ, and the reasons behind them are manifold: someone deleted the wrong row by mistake, deleted or modified more rows than planned, or even dropped complete objects by accident and now needs to recover them. The answer to such questions is more or less always the same: “SQL Server does not offer any documented built-in mechanisms to view the content of the log file or recover specific rows from the log. You need to use a third-party tool.” One of these specialised tools is ApexSQL Log, which is the subject of my discussion here.
The purpose of this whitepaper is to explore how ApexSQL Log can assist you, as a DBA, in your (hopefully not!) daily work when you are confronted with questions like those mentioned above. I intentionally say “hopefully not”, because ApexSQL Log should clearly be seen for what it is and what it is designed for: a tool of last resort!
Useful and necessary as it is, it does not absolve you from your responsibility to implement a suitable permission concept for the users of your databases. Nor does it save you from implementing a proper backup strategy and probably even a disaster recovery plan. This is the first of a series of two whitepapers. In the paper at hand I will discuss the use of ApexSQL Log in an “optimal” OLTP environment; the second part will then deal with case studies in a suboptimal OLTP environment.
Installation, System Requirements, and Supported SQL Server Versions
You can download the software from the ApexSQL homepage at http://www.apexsql.com and explore it in a fully functional 30-day trial. After buying the tool, you need to activate it, either automatically or manually. Licensing is per server; the activation process must be repeated for each instance you want to monitor on the given server.
The whole installation process is straightforward and simple to handle. One thing to be aware of, however, is that on the first connection to a new server ApexSQL Log needs to install some “server-side components”, which means that several objects will be created in your master database: a stored procedure and several extended stored procedures. They are required for the software to function, and they must reside in master because that is the only database in which SQL Server allows custom extended stored procedures to be placed.
The system requirements are as follows:
ADO 2.7 (or higher)
Windows NT 4.0 SP3+
PC at 233 MHz (or faster). Pentium III recommended
256 MB RAM (512 MB RAM recommended)
Hard Disk Drive Space:
An Introduction to the SQL Server Transaction Log
The Microsoft SQL Server transaction log is among the least understood, or most often misunderstood, parts of the whole product. There seems to be a great deal of uncertainty among users about the proper treatment of the log. So, before we actually deal with the log, we first need some basic understanding of what the log is (and what it is not) and what its purpose is.
What’s the log?
Simply put, the transaction log consists of one or more physical files associated with exactly one database. Every database has a transaction log, and no two databases can share the same log. Even if you feel you don’t need the log at all, you still have to deal with it: there is no way to create a database without a log, nor is there a way to delete or drop the log after the database has been created. Don’t bother trying to find a workaround; you are likely to render your database unusable and waste your time.
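You can verify this yourself from the catalog views. A minimal check, assuming SQL Server 2005 or later (earlier versions would query sysfiles instead of sys.database_files):

```sql
-- Every database reports at least one ROWS (data) file and at least
-- one LOG file; "size" is in 8 KB pages, so we convert it to MB.
SELECT name, type_desc, physical_name, size * 8 / 1024 AS size_mb
FROM sys.database_files;
```

Run this in the context of any database and you will always find a row with type_desc = 'LOG'.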
So, since you now know that there is always a log for a given database, what then is the purpose of this log?
The transaction log records all the changes made to a database and thus gives SQL Server all the information needed to recover the database. Changes are written to the log before they are written to the database. Recovery is the process of bringing the log and the database into sync with each other: if a change is marked as committed in the log, it must also appear in the database; likewise, any change not marked as committed in the log must not appear in the database. Recovery happens whenever an instance of SQL Server is started or when a RESTORE request is explicitly made.
So, all changes made to the database are recorded in the log? What changes exactly?
Well, that depends on the recovery model the database is in. For the purpose of my discussion here, we will establish in the next chapter that the “typical” OLTP database should run under the Full recovery model, in which case the log records not only changes to the data made by INSERT, UPDATE, or DELETE operations, but also BULK INSERT and BCP operations and some DDL operations such as CREATE INDEX. The log actually records a whole lot more information, but it is definitely beyond the scope of this paper to discuss it all. If you are interested in further in-depth coverage of this topic, I would suggest you get your hands on “Inside SQL Server 2005: The Storage Engine” by SQL Server MVP Kalen Delaney, as this is undoubtedly the ultimate book on the internals of SQL Server and covers the transaction log in great detail. Another very informative source are two whitepapers about SQL Server’s I/O basics, which you can find at:
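To make this concrete, here is how you can check and, if necessary, change the recovery model. The database name and backup path are placeholders; the catalog view assumes SQL Server 2005 or later:

```sql
-- Check the recovery model of every database on the instance.
SELECT name, recovery_model_desc
FROM sys.databases;

-- Switch a database to the Full recovery model.
ALTER DATABASE MyDatabase SET RECOVERY FULL;

-- Important: after switching to Full, take a full backup to start the
-- log backup chain; until then the log still behaves as under Simple.
BACKUP DATABASE MyDatabase TO DISK = N'D:\Backup\MyDatabase.bak';
```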
General recommendations regarding log treatment and backup strategies
The following recommendations are largely based on a note I got from an online friend of mine, Jonathan van Houtte. I think they sum the topic up very well.
- Determine the fault tolerance of your databases, that is, how much data loss your company is willing to bear. Have that service level agreement written down and accepted by all involved parties.
- OLTP production databases should almost never run under the Simple recovery model. There are exceptions, but for the vast majority of databases this holds true. The only “advantages” of this setting are:
- You don’t have to fully understand the log and how it works.
- You’re less likely to run out of disk space.
- The disadvantages, however, are serious:
- Recurring performance impact caused by the automatic truncation process.
- You are not able to recover to a point in time or to the point of failure, so in case of a disaster, data loss is inescapable.
- Under the Full recovery model, by contrast, you need to truncate the transaction log yourself by backing it up regularly; no full or differential backup operation will do this for you.
- Log backups can be taken separately from other backups and can be restored independently.
- If disaster strikes, the very first thing you should do is attempt to back up the log. Read this again. It might save you someday.
- Don’t shrink the transaction log file(s) as a matter of course. The log file(s) should be sized to the maximum amount of logged information that may accumulate between two backups. Add a percentage of that size as a contingency reserve for unexpected operations. If you shrink the file(s), they will need to grow again; the auto-growth process is a relatively expensive one and can turn into a performance bottleneck, hurting overall performance. Additionally, the growth will increase file fragmentation. Likewise, you should size a database to its estimated size after some time in production. If you predict that size to be XX GB, allocate that much space right when you create the database.
- Make sure you are able to restore and recover your database to its most recent state. Don’t assume that you can do so; prove that you can! This is very likely the single most important task in your job as a DBA, and you should be able to recite the necessary steps without any hesitation. Schedule such an exercise at regular intervals (every XX months, or as soon as you change anything in the backup process).
- It is certain that your hardware will fail. The only question is when.
- Make sure you have at least two independent restore options before you delete or overwrite any backups.
- If you’re using write-caching disk controllers, their memory should be redundant and battery-backed. Make sure you understand how to recover their cache to disk before you have to find out the hard way when disaster strikes. Disable write-caching on the disks themselves.
- The transaction log should be on its own physical disk, not on the same array as the data. Because of the log’s sequential write pattern, RAID 1 is a good choice for log files.
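To make the “back up the log first, then restore” advice concrete, the following sketch shows the basic disaster sequence: a tail-log backup followed by a point-in-time restore. All names, paths, and the timestamp are placeholders:

```sql
-- After a disaster, first try to back up the tail of the log.
-- NO_TRUNCATE allows this even if the data files are damaged.
BACKUP LOG MyDatabase
    TO DISK = N'D:\Backup\MyDatabase_tail.trn'
    WITH NO_TRUNCATE;

-- Restore the last full backup without recovering the database ...
RESTORE DATABASE MyDatabase
    FROM DISK = N'D:\Backup\MyDatabase.bak'
    WITH NORECOVERY;

-- ... then restore the log, stopping just before the damaging operation.
RESTORE LOG MyDatabase
    FROM DISK = N'D:\Backup\MyDatabase_tail.trn'
    WITH STOPAT = '2007-06-01 14:30:00', RECOVERY;
```

In a real scenario you would restore every log backup in the chain in sequence, specifying RECOVERY (and, if needed, STOPAT) only on the last one.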
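The sizing recommendation above can be applied right at creation time instead of relying on auto-growth. A sketch with placeholder names, paths, and sizes; note the log placed on its own drive:

```sql
-- Pre-size data and log files so auto-growth stays the exception,
-- and keep the log on a separate physical disk.
CREATE DATABASE MyDatabase
ON PRIMARY
    (NAME = MyDatabase_data,
     FILENAME = N'D:\Data\MyDatabase.mdf',
     SIZE = 20GB)
LOG ON
    (NAME = MyDatabase_log,
     FILENAME = N'E:\Log\MyDatabase.ldf',  -- dedicated log disk
     SIZE = 5GB,
     FILEGROWTH = 512MB);                  -- fixed step, not a percentage
```

A fixed growth increment is generally preferable to a percentage, as it keeps growth events predictable as the files get larger.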