Site sponsored by: Idera Try Idera’s new SQL admin toolset
SQL Server Performance

  • Home
  • Articles
  • Forums
  • Tips
  • Quiz
  • FAQ's
  • Blogs
  • Software
  • Books
  • About Us
RSS Feeds
Sign in | Join


Article Topics

All Articles
Peformance Tuning
Audit
Business Intelligence
Clustering
Reporting Services
Developer
General DBA
ASP.NET / ADO.NET

SQL Server 2008 - Worth the Wait

SQL Server’s first significant upgrade in three years features a number of envelope-pushing enhancements and improvements. Which will have the greatest impact on SQL administration and development? More...
Latest Articles

Slowly Changing Dimensions in SQL Server 2005
Audit Data Modifications
SQL Server 2008’s Management Data Warehouse
Same Report but Different Methods in SQL Server Reporting Services ...

More     
 
Latest FAQ's

How to Integrate Performance Monitor and SQL Profiler
SSIS Lookups are Case Sensitive
Convert Number to Words in SSRS
After installing SP2 on SQL Server 2005 x64, when trying to ...

More     
   
Latest Software Reviews

SQL Server DBA Dashboard
SwisSQL DBChangeManager
SQLMesh - SQL Server Search Tool
SoftTreeTech SQL Assistant

More     

articles >> peformance tuning >> Parallel Statistics Update

Parallel Statistics Update

By : Merrill Aldrich
Apr 30, 2007
Printer friendly

Recently in my own practice, I was abruptly reminded of the value of SQL Server's Update Statistics with Full Scan: to make a long story short, we had a system that was getting a bad query plan that would consistently peg one of the CPUs for hours at a time. If multiple requests for the same data came in, more CPUs would be taken up, eventually using 100% of all processors. The query was a simple one, and we were stumped by this issue for a time. It turned out that the maintenance on the database in question just had too low a sampling rate for Update Statistics, which had the effect of causing the optimizer to choose a bad plan. Changing to Update Statistics with Full Scan solved the issue.

However, full scan statistics update can be slow and expensive on a large system, and a single update statistics job for a whole database can easily exceed an overnight maintenance window. This gave me the idea to create jobs that could update statistics on the system in parallel, as follows.

Note: this technique is specifically designed for a manual statistics update, and might not be appropriate on all systems. For example, if you already reindex regularly, then the statistics for the reindexed objects may automatically be updated, and there is no need to update statistics separately.

The idea here is to make all subsystems work on your server to update as many statistics objects as possible, within a defined maintenance window, in order from the worst cases to the best. The hope is that the process can run nightly, and will chip away at all the indexes and statistics objects in order. If the maintenance window runs out, it'll stop and not interfere with the server's performance during the day. When the next evening rolls around, it will begin again, but will work on the worst cases first, which hopefully will include any objects for which there was not time to update statistics the previous night. On a large system, the statistics should at least be updated every few nights, in a round-robin fashion.

The design consists of a collection of SQL Agent jobs: one is a "controller" that starts and stops the process, and the others are "workers" that perform the updates against collections of indexes. The controller establishes a queue of objects to be updated, and the workers pull items from that queue until they run out or the controller stops the process at the end of the maintenance time period.

Establish a Set of Indexes to Update

To implement this pattern, we begin with a select statement that contains all the indexes and statistics objects needed, with data about their status and size:

select
schemas.name as table_schema,
tbls.name as table_name,
i.name as index_name,
i.id as table_id,
i.indid as index_id,
i.groupid,
i.rowmodctr as modifiedRows,
(select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2) as rowcnt,
stats_date( i.id, i.indid ) as lastStatsUpdate,
'False' as Processed
into ##updateStatsQueue
from sysindexes i
inner join sysobjects tbls on i.id = tbls.id
inner join sysusers schemas on tbls.uid = schemas.uid
inner join information_schema.tables tl
on tbls.name = tl.table_name
and schemas.name = tl.table_schema
and tl.table_type='BASE TABLE'
where 0 < i.indid and i.indid < 255
and table_schema <> 'sys'
and i.rowmodctr <> 0
and (select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2) > 0

This statement will query mainly the sysindexes table to create a list of all indexes and statistics objects, for non-system tables, with each object's last statistics update time, the number of rows in the underlying table, and the number of rows modified since the last statistics update. The results are restricted only to those objects that have changed since the last statistics update, by choosing only rows where sysindexes.rowmodcounter is nonzero. (Note that this value can be negative in some cases, so I am deliberately using "<> 0.")


There is one tricky bit to this: objects that are strictly statistics -- as opposed to indexes -- do appear in sysindexes, but always with a row count (sysindexes.rowcnt) of zero; sysindexes does not report the row count in the underlying table. In order to prioritize the queue I am generating, I would like to be able to compute how many modified rows there are out of the total number of rows in the underlying table, that is rowmodctr / rowcnt. However, we'd get either nonsense or a divide by zero error for objects that are statistics-only, since rowcnt is always zero. For that reason, I have this subquery:

select max(rowcnt) from sysindexes i2 where i.id = i2.id and i2.indid < 2

This will transform the row count for the statistics objects into the row count for the underlying table.


    Next Page>>    








Home | Peformance Articles | Audit Articles | Business Intelligence Articles | Clustering Articles | Developer Articles | Reporting Services Articles | DBA Articles | ASP.NET / ADO.NET Articles | DBA FAQ's | Developer Peformance FAQ's | DBA Peformance FAQ's | Developer FAQ's | Clustering FAQ's | Error Messages | Audit Tool Reviews | Backup Tool Reviews | Coding Tool Reviews | Compare Tool Reviews | Documentation Tool Reviews | Design Tool Reviews | Monitoring Tool Reviews | Log Tool Reviews | Reporting Tool Reviews | Clustering Tool Reviews | Security Tool Reviews | Change Management Tool Reviews | Remote Access Tool Reviews | Book Reviews | Security Tool Reviews | QDPMA Performance Tuning | ADO.NET / ASP.NET | Administration | Analysis/OLAP Services | Application Development | Configuration | Components | ETL | Hardware | High Availability | Hints | Index | Misc | Operating Systems | Performance Tuning | Replication | T-SQL | Views