Parallel Statistics Update

Recently in my own practice, I was abruptly reminded of the value of SQL Server’s Update Statistics with Full Scan: to make a long story short, we had a system that was getting a bad query plan that would consistently peg one of the CPUs for hours at a time. If multiple requests for the same data came in, more CPUs would be taken up, eventually using 100% of all processors. The query was a simple one, and we were stumped by this issue for a time. It turned out that the maintenance on the database in question just had too low a sampling rate for Update Statistics, which had the effect of causing the optimizer to choose a bad plan. Changing to Update Statistics with Full Scan solved the issue.

However, full scan statistics update can be slow and expensive on a large system, and a single update statistics job for a whole database can easily exceed an overnight maintenance window. This gave me the idea to create jobs that could update statistics on the system in parallel, as follows.

Note: this technique is specifically designed for a manual statistics update, and might not be appropriate on all systems. For example, if you already reindex regularly, then the statistics for the reindexed objects may automatically be updated, and there is no need to update statistics separately.

The idea here is to make all subsystems work on your server to update as many statistics objects as possible, within a defined maintenance window, in order from the worst cases to the best. The hope is that the process can run nightly, and will chip away at all the indexes and statistics objects in order. If the maintenance window runs out, it’ll stop and not interfere with the server’s performance during the day. When the next evening rolls around, it will begin again, but will work on the worst cases first, which hopefully will include any objects for which there was not time to update statistics the previous night. On a large system, the statistics should at least be updated every few nights, in a round-robin fashion.

The design consists of a collection of SQL Agent jobs: one is a “controller” that starts and stops the process, and the others are “workers” that perform the updates against collections of indexes. The controller establishes a queue of objects to be updated, and the workers pull items from that queue until they run out or the controller stops the process at the end of the maintenance time period.

Establish a Set of Indexes to Update

To implement this pattern, we begin with a select statement that contains all the indexes and statistics objects needed, with data about their status and size:

select
schemas.name as table_schema,
tbls.name as table_name,
i.name as index_name,
i.id as table_id,
i.indid as index_id,
i.groupid,
i.rowmodctr as modifiedRows,
(select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2) as rowcnt,
stats_date( i.id, i.indid ) as lastStatsUpdate,
‘False’ as Processed
into ##updateStatsQueue
from sysindexes i
inner join sysobjects tbls on i.id = tbls.id
inner join sysusers schemas on tbls.uid = schemas.uid
inner join information_schema.tables tl
on tbls.name = tl.table_name
and schemas.name = tl.table_schema
and tl.table_type=’BASE TABLE’
where 0 < i.indid and i.indid < 255
and table_schema <> ‘sys’
and i.rowmodctr <> 0
and (select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2) > 0

This statement will query mainly the sysindexes table to create a list of all indexes and statistics objects, for non-system tables, with each object’s last statistics update time, the number of rows in the underlying table, and the number of rows modified since the last statistics update. The results are restricted only to those objects that have changed since the last statistics update, by choosing only rows where sysindexes.rowmodcounter is nonzero. (Note that this value can be negative in some cases, so I am deliberately using “<> 0.”)











There is one tricky bit to this: objects that are strictly statistics — as opposed to indexes — do appear in sysindexes, but always with a row count (sysindexes.rowcnt) of zero; sysindexes does not report the row count in the underlying table. In order to prioritize the queue I am generating, I would like to be able to compute how many modified rows there are out of the total number of rows in the underlying table, that is rowmodctr / rowcnt. However, we’d get either nonsense or a divide by zero error for objects that are statistics-only, since rowcnt is always zero. For that reason, I have this subquery:

select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2

This will transform the row count for the statistics objects into the row count for the underlying table.

Continues…

Leave a comment

Your email address will not be published.