Transactional Replication – Unable to Replicate Data onto the Subscriber – A Practical Example

Scenario: A couple of days back, I got a call from
one of my users saying that the Data is not being replicated properly onto
the Production Reporting Server from the Production OLTP server.

We have Transactional Replication scheduled for every 15 minutes between our OLTP and
the Reporting server.

Approach:

I started investigating this issue by first logging on to
the Production OLTP server and checking the following parameters:

  1. I
    explored the Activity Monitor in SSMS and
    checked whether there is any sleeping process which has consumed the majority of
    the CPU or I/O but I found no issues here.
  2. I
    then examined the long running queries using the Standard Reports present in
    SSMS and found that the CPU utilization for those
    queries was not very high.

  3. I
    then looked at the status of the Replication Log Reader agent
    and found that it was capturing the Replicated Commands. The
    message viewed in the Log Reader agent was as below:
  4. 4 transactions with 75 commands were generated. 
  5. After
    performing Step 3, I looked at the synchronization status
    of the subscriber and found that there was an issue with it. Below is the message which
    was shown in the synchronization status:
  6. The process is running and is waiting for a response from the server. 
  7. I
    then looked for any blocking processes on the OLTP
    server where the publisher database is hosted. To do this, I
    executed the below T-SQL query against the master database on the OLTP server.
  8. Select * from sysprocesses
  9. After executing this query, I found
    that under the column named blocked all the values were 0 which meant
    that there was no blocking on the OLTP server.
  10. I
    then investigated whether the distib.exe file on the OLTP server
    was causing any performance issues but noted that were no
    issues with it.
  11. In addition, there were no issues with respect to the N/W speed. Both the OLTP as
    and Reporting servers were located in the same Data Center.

After performing all the above steps, I was 100% sure that
there were no issues on the OLTP server and the issue was almost certainly on the
Reporting server.

I then logged on to the Reporting Server and performed the
following checks:

  1. I first checked the synchronization status of the subscriber and found
    that it wasn’t receiving any data and the message shown was:
  2. The process is
    running and is waiting for a response from the server.
  3. I
    then explored the Activity Monitor and found that there were a few processes in
    which were sleeping and whose CPU as well as IO utilization were high, I
    therefore kill these processes.
  4. After
    performing the step 2, I then had a look at the Long Running queries report
    under the Standard Reports in SSMS and found no issues.
  5. I
    then decided to have a look at the blocking by executing the below query
    against the master database.
  6. Select * from sysprocesses
  7. Once the above query was executed, I
    noticed that under the blocked column there were three values named 65,61 and
    60 which were the SP IDs of the blocking processes. On further investigation it
    was found that these were the system processes. I decided to KILL them by
    issuing the following command against the master database.

    KILL 60 
    KILL 61 
    KILL 65
    
  8. After executing this query, I again
    executed the below query against the master database to check whether there was
    still any blocking.
  9. Select * from sysprocesses
  10. Once the above query got executed under the
    column named blocked all the values were 0 which indicated that there was no blocking.
  11. I then had a look at the synchronization
    status of the subscriber and found that the data had started synchronizing well.
    The message viewed were as follows:
  12. 110 Transactions with 2295 Commands were delivered.
    57 Transactions with 8767 commands were delivered.
  13. I then had a look at the Replication Activity Monitor and found that the latency period shown was 8 hours and 57
    minutes which meant that the data last was replicated 8 hours and 57 minutes
    ago. Once the synchronization started well it just took around 50 minutes for both the servers to get back into sync.

Thanks for taking the time to read this, please let me know if you have any questions or comments. My sincere thanks to my
colleagues Arti Samkaria, Sharanjeet Singh Sansoya as well as Khushboo Jatav.




Array

5 Responses to “Transactional Replication – Unable to Replicate Data onto the Subscriber – A Practical Example”

  1. Is it worth to kill sleeping processes unless they were participating in blocking(open tran,etc)?

  2. Nice case study.

    Would it be worth to take a look at the transactions at subscriber to check as to why they are blocked. Were they insert statements and if so asking why subscribers are getting updated in transactional replication.

    Capturing SQL Text by using DMV or using Activity to check queries which were blocked would have been helpful.

    I know in situations like this the first thing anybody/dba would think of restoring replication as soon as possible but I think it would be worth checking as to why this issue was caused.(Root Cause Analysis).

    Good case study. :)

  3. Hi Satnam,

    Its a usefull article. Small correction you have missed the numbering. Under OLTP points, the point 2 should be 3 i believe.

    Thanks

  4. Was really helpful.

    Thanks bro :)

  5. Thank you so much for sharing this awesome article. Nice to read “Thanks for taking the time to read this”.

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |