Monitoring Table Size Growth in SQL Server

General
In this article I introduce a simple process that saves tables
size information at points in time which latter allows for tracking down table
growth over time.

The process is useful in monitoring data growth over time and lets
you see what tables are growing rapidly in what databases and can also help in
estimating future growth and disk space requirements. In addition it can help
prevent unexpectedly running out of disk space, or if you are already in
that situation then quickly find  the table(s) that for some reason have grown
more than excepted and are the source to the problem.

In addition, I also show a stored procedure called sp_DataFiles that returns a report of disk space usage
information at the data file level.

Implementation

Run these three scripts:

TableSizeHistory.sql

sp_TableSize.sql

ExecuteDatabasesCommand.sql

The process is built on top of a stored procedure called
sp_TableSize which returns table level information such as row count and size.

Periodically, typically by a weekly SQL Server Agent job,  the procedure
will be executed against all databases using an INSERT…EXEC statement that saves the results into a database table.

Having done that we actually freeze and document size related
information at the table level which can now be analyzed and so the next and
final step  would be to retrieve the data.

  

Usage

There are two main scenarios in which you can benefit from the
process;

One scenario is simply viewing the report knowing what tables grow
in what size and that rate at which they grow.

The second scenario would be to execute the DataGrowth stored
procedure while passing the @Delta_MB input parameter a value that is not
expected to be reached so that it always returns an empty result set. Now, when
the day comes and this procedure returns data you know there was a growth in
data exceeding your predefined threshold and can trigger a notification email.

Summary of scripts used:

TableSizeHistory.sql – Creates the PerfDB database and the TableSizeHistory table

DataGrowth.sql – Analyzes and retrieves the captured data

ExecuteDatabasesCommand.sql – A wrapper over a database cursor

sp_DataFiles.sql – Returns size information at the data file level for all
databases

sp_TableSize.sql – Returns table level information for the current database


Download all the scripts
   

sp_TableSize
sp_TableSize is a stored procedure
created in the master database and marked as a system stored procedure. The
procedure works at the table level in the context of the current database, that
is the database where it is executed from and returns one row per table with
the information of the row count and disk space usage in units of MB sorted by
disk space usage in a descending order.

The procedure is handy and easy to use with no input
parameters required and the information returned is based on system meta data,
meaning you do not actually access any user object so there is no effect (no
shared locks and no IO) on user objects as opposed to issuing a COUNT(*) query.

Bare in mind that due to the above fact there may
be a slight(minor)inaccuracy with the row count figure but that is not
common and results from the way SQL Server’s Storage Engine keeps track of data
that was deleted and typically gets aligned with the accurate figure following
a clustered index rebuild.

Part of the code in sp_TableSize is combined in my
sp_helpindex2
stored procedure.

DataGrowth
This stored procedure analyzes and retrieves the data that
was saved to the TableSizeHistory table using the sp_TableSize stored
procedure and returns the disk space growth in units of MB per each table.

The code identifies the first and last occurrences of
each table in the requested time period as defined by the @StartTime and
@EndTime input parameters and calculates the differences between the two occurrences
so that the returned result set is the delta between these two values.

You can control the tables returned by the @Delta_MB
input parameter so that you get to see only tables of interest, i.e. tables
that grew more than a certain size.

The Delta_reserved_MB column is the sum of the
Delta_data_MB + Delta_index_size_MB and this is the actual disk size the table
uses. Don’t let the name confuse you, I only followed Microsoft’s terminology
used in their sp_spaceused system stored procedure.

The @Database parameter has a default value of 1 and
returns a second result set with information at the database level.

ExecuteDatabasesCommand
This procedure is actually a wrapper over a cursor that
uses the database name from sys.databases DMV.

I created this procedure in order to reuse my code so that
every time I need a cursor over databases I use it instead of recoding that
cursor part. i.e. when I run an index rebuild process that uses my sp_Reindex
procedure, DBCC commands for integrity checks, attach, detach etc.

sp_DataFiles
This stored procedure is not related to the data growth process but
is a very useful disk space related tool.

sp_DataFiles works at the database level and returns one row per each data
file on the SQL Server instance.

The procedure is actually built on top of the disk
usage report query issued by SSMS (SQL Server Management Studio) which I
captured in Profiler a couple of years a go and adjusted a little for my needs. Being able to get that report by
executing a procedure is far more convenience and efficient than going through
the entire process required via the UI until finally receiving that desired
report and also, the SSMS report works on a specific database and when you need
to get a more wider picture of how your disk space is consumed by your
databases it is not enough.

The procedure makes use of the undocumented
DBCC command SHOWFILESTATS which returns the total extents and the used
extents information per data file and joins that information with the
sys.data_spaces DMV (Dynamic Management View).

All the disk space figures are calculated and derived
from the extent information returned by the DBCC command while the DMV only
contributes the file group name to the returned results.

DBCC SHOWFILESTATS returns one row per each data file in the database and sys.data_spaces returns one row per each data
space (which can actually be  a filegroup, partition scheme, or FILESTREAM data
filegroup).

The procedure also returns the space reserved and space used
information which is very useful. When these two figures are far from each other
(i.e. a data file that stores 10GB of data but reserves 40GB) there is space
that can be reclaimed to the operating system if needed. 

On the other hand when these two figures are close to
each other it points out that the file Auto Grow option is kicking in and that
of course is something we usually want to avoid.

 

Continues…

Pages: 1 2




Array

No comments yet... Be the first to leave a reply!