Using Asynchronous Statistics Updating in SQL Server 2005
Most DBAs are familiar with the AUTO_UPDATE_STATISTICS database option. Under most conditions, this feature provides the Query Optimizer with up-to-date index and column statistics so that optimum query plans can be created to execute queries.
But what many DBAs don’t know is how this option affects the performance of queries. For example, the AUTO_UPDATE_STATISTICS option, when turned on (which it is by default for a newly created user database), tells the Query Optimizer to automatically update the index and column statistics as data in the table changes. This is important because the distribution of data in rows can affect how the Query Optimizer identifies an optimum query plan. Statistics are automatically updated whenever the statistics used in a query execution plan fail an internal test for current statistics. This test is somewhat complex, but essentially, statistics are considered outdated when 20% or more of the rows in a table have changed.
So when the Query Optimizer determines that some index or column statistics need to be updated for a particular query, it updates them, right then and there. Once the statistics update is done, the query is recompiled using the new statistics, a new query plan is created and the query executed. In most cases, this is an invisible process to everyone. But not always.
In some cases, especially for very large tables that have a lot of indexes and column statistics on it, the statistics update process can take many seconds. During this time, the query has to wait until the statistics updating is complete before it is recompiled and executed with the new statistics information. This delay can result in two possible outcomes. First, query performance may seem to be erratic. For example, most of the time the query may take less than a second, but periodically it might take 15 seconds to run. And if the application running the query has a very short timeout time, it is possible for the application to time out before the statistics have been updated and the query actually executed.
Another thing to keep in mind is that when statistics are updated, there may or may not be any affect on the query plan. More often than not, when statistics are updated, there is not enough change in the statistics to affect how the Query Optimizer optimizes the query plan. But in some cases the update is critical, as the data has changed enough so that it does affect what query plan the Query Optimizer chooses. Since there is no way to easily know this, the assumption is to play it safe and update statistics whether or not they need to be updated.
Again, in most cases, the above occurrence is not a big problem. But in some cases, it can be. In SQL Server 2005, there is a new feature that overcomes this problem. It is a new database option called AUTO_UPDATE_STATISTICS_ASYNC. If this option is turned on for a database, and the AUTO_UPDATE_STATISTICS option is also turned on for the database, then the behavior of the statistics updating process changes. Now, instead of the query having to wait until the statistics are updated to execute, the query runs immediately using the old statistics and query plan. And in the background, a thread is spawned that performs the actual statistics update. When the statistics have been updated, they are now available the next time a query is run that needs them.
What this accomplishes is to make the performance of affected queries more predictable, and potentially eliminate application timeouts due to long running statistics updates. But keep this in mind: it is also possible that this could make the query run even longer, not shorter. For example, if the rows in the table have changed so much that statistics need to be updated, then it is possible that using the old statistics and old query plan to run the query now, instead of waiting, could cause the query to take much more time because the old query plan and statistics are way too out of date, and a new, more efficient query plan, based on the new statistics, is needed.
This is virtually impossible to predict. Because of the potential possibility that outdated statistics and query plans could make queries run slower, Microsoft only recommends using AUTO_UPDATE_STATISTICS_ASYNC in SQL Server 2005 under two conditions:
- When it is more critical that queries perform with a predictable response time than to run less efficient query plans.
- When you have an application that times out because of waiting for statistics to be automatically updated.
If you don’t experience either of the above, then you should not use the AUTO_UPDATE_STATISTICS_ASYNC option.
The AUTO_UPDATE_STATISTICS_ASYNC option is an all or nothing proposition. It is turned on and off at the database level, and affects all statistics updates equally. To turn this feature on, use:
ALTER DATABASE AUTO_UPDATE_STATISTICS_ASYNC ON
And to turn it off, use:
ALTER DATABASE AUTO_UPDATE_STATISTICS_ASYNC OFF
Another new related feature of SQL Server 2005 is the sys.dm_exec_background_job_queue dynamic management view. This table lets you view whether any statistics are queued for updating, or are in the process of updating. In addition, you can use the sys.dm_exex_background_job_queue_stats dynamic management view to see the aggregate statistics of all background job queues. These may be useful to see how using the AUTO_UPDATE_STATISTICS_ASYNC option affects the overall performance of your database.
The AUTO_UPDATE_STATISTICS_ASYNC database option, like so many options in SQL Server, has tradeoffs. If you think this option could help you out in a particular situation, test it first on a test system. You don’t want to experiment on your production system.