Controlling Growth of a Distribution Database

was recently asked to set up Transactional Replication comprising of 29
publishers as well as same number of subscribers. One interesting challenge was
that the size of the distribution database suddenly increased from few MB’s to
around 62 GB and as a result the disk on which the production database was
hosted quickly filled up. The DML operations on the tables on the replicated
database was continuous. After investigating this, I noticed there was a table
named MSRepl_Commands present in the distribution database consisting
of a large number of records which was the major culprit for the drastic
increase in size. Going deeper, I also found that a job named Distribution
clean up: distribution
was taking a considerably long time to execute (i.e.
around 4 hrs and 35 minutes) which was very strange.

There are 2 jobs in SQL Server
which are responsible for the cleanup of tables present in the distribution

  1. Agent history cleanup: distribution
  2. Distribution clean up:distribution

two jobs can be found by expanding the Jobs folder present under SQL Server
Agent node in the SQL Server Management Studio.

job named Distribution clean up: distribution itself has one stored
procedure named sp_MSdistribution_cleanup executing internally which
had two parameters , namely @min_distretention and @max_distretention.

decided to manually execute this stored procedure against the distribution
database by specifying the values of @min_distretention and @max_distretention
using the syntax below.

dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 72

Before executing the stored procedure, I manually disabled both the above jobs
otherwise there would be a possibility of some deadlock occurring.

stored procedure took about 40 minutes to execute, after executing it output the below message:

Removed 0 replicated
transactions consisting of 8771 statements in 2171 seconds (4 rows/sec)

The main issue why the
distribution database was getting larger and larger in size was that the clean
up job was not removing the replicated commands from MSRepl_Commands
table. This table was around 26 GB in size with around 6.5 million records in

I then had a look at the
retention period of the publisher and identified that it was set to Never
Expire, as per the below  screen capture:

This was applicable to all the
29 publishers.

I then changed the retention
period as 480 hours i.e. 20 days and then executed the stored procedure named sp_MSdistribution_cleanup
against the distribution database as shown below:

dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 480

Once the above query was
executed, the output was as follows:

I then checked the MSRepl_Commands
table and found that the number of rows decreased from 6.45 million to 2.5 million.
I then manually shrunk the mdf file of the database involved in replication by
around 28 GB and the free space available on the disk grew to around 54 GB. I
also changed the recovery model of the database to Simple to minimize the log
file growth. The cleanup job now takes a few minutes to execute. After
performing these tasks, to date there hasn’t been any issue with the size of
the distribution database and the replication as a whole is working fine.


No comments yet... Be the first to leave a reply!